Grants:Simple/Applications/Wikimedia Taiwan/2018/Lesson Learn of GLAM

In 2018, Wikimedia Taiwan ran a GLAM project in cooperation with the Li Meishu Memorial Gallery. So that the donated collection items could be put to further use, we created the corresponding Wikidata item immediately after uploading each digital file. We originally expected this only to make the uploaded files easier to search and sort programmatically. As the project progressed, however, we found that the nature of Wikidata lets a collection's data be compared against all other recorded knowledge, which helps researchers notice questions that were easily overlooked in the past. The comparison also produces noise; whether this noise should be removed, and how, are questions for future research.

We believe this finding teaches us two things:

  1. When seeking GLAM partnerships in the past, we only emphasized that cooperation helps showcase the knowledge an institution already has. From now on we can also describe how we can help GLAM institutions produce new knowledge.
  2. When promoting Wikidata editing in the past, we focused only on operating and applying Wikidata itself. Comparing Wikidata with other, subject-specific databases highlights its advantages far better.

Background

Internal: The Wikimedia Movement in Taiwan

Before 2018, the Wikimedia movement in Taiwan had very little experience cooperating with GLAM institutions. We had tried using museum knowledge as the theme of a few small meetups, but the results were poor. GLAM institutions did not really understand what the Wikimedia movement was hoping for, so when we asked to discuss cooperation, staff usually assumed we wanted to book a guided tour. As a result, the only events we could hold were photo outings whose pictures were uploaded to Wikimedia Commons. Those photos could only be used in the article about the institution itself, so they were of limited practical value.

Fortunately, in late 2017 the Li Meishu Memorial Gallery approached Wikimedia Taiwan on its own initiative. The volunteers in its information team had been trying to make the gallery more open and had previously published the source code of its website. They came to us because they wanted to use the openness and reach of Wikipedia to advance the gallery's strategic goal of promoting art education in Taiwan. Wikimedia Taiwan therefore included a GLAM project in its annual plan for the first time in 2018. Our estimate was deliberately cautious: if the project gave us a long-term partner and some experience running GLAM projects, we would consider it well worthwhile.

Both sides quickly agreed that the first-year project would be to donate digital files of the paintings in the collection to Wikimedia Commons. Because we wanted to reach more people and serve the goal of art education, we ruled out bot uploads from the start and chose instead to hold workshops and invite volunteers to do the uploading. While designing the workshop, we felt that asking volunteers only to upload images would be dull and not very educational, while asking them to write a Wikipedia article for each painting would be too hard because of the lack of sources. Wikimedia Taiwan therefore proposed having volunteers create Wikidata items for the paintings. We knew about Wikidata and had made some edits before, but we had rarely built it into outreach activities. This decision made Wikidata, together with the GLAM project, a brand-new venture in our outreach work in 2018.

External: An Introduction to GLAM Institutions in Taiwan

GLAM institutions in Taiwan fall into three main types:

  1. Institutions established by government agencies
  2. Facilities attached to schools and universities
  3. Institutions run by non-profit foundations

GLAM institutions established by government are very diverse. Those managed by the central government often become cultural, tourist, or scientific highlights: the National Palace Museum is a regular stop for international visitors, and the National Central Library holds almost every publication issued in Taiwan. Institutions managed by local governments put more weight on practical use. City libraries, for example, mostly set up branches with smaller collections in smaller districts, where the free reading space and lectures are valued by residents more than the size of the collection. Some military dependents' villages, Hakka communities, and indigenous communities set up very small cultural museums, which usually display only simple pictures, text, and models. Their purpose is to attract families on holiday outings and to stimulate nearby commercial activity.

Among school facilities, libraries and archives are usually open only to members of the school, but some university departments run museums that are open to visitors. Some of these museums exist because the school has a long history or distinctive features, and curating them helps build the school's reputation. National Taiwan University (NTU), as the first university in Taiwan, is an example: its departments of forestry, archaeology and anthropology, and history each have their own museum, showing items collected since the Taihoku Imperial University period. Other museums mainly let students practice turning what they have learned into an exhibition. They are small, their exhibits consist of pictures, text, and models, and although they are open to visitors, little effort goes into running them.

GLAM institutions run by non-profit foundations vary in form and quality. Most museums are run as leisure attractions, while libraries usually pursue the public-interest goal of serving local residents. Art galleries tend to pay the most attention to exhibition design and planning, and some also act as intermediaries for art sales.

Most large GLAM institutions in Taiwan have been slow to digitize. Most only maintain a website with an introduction and event news, and have recently begun running social media accounts. Libraries, out of operational need, build systems for searching, looking up, reserving, borrowing, and returning books. Apart from large public institutions, very few offer a publicly accessible collection database. Large public institutions have more resources to build such databases, but for a long time they restricted their use for various reasons. The National Palace Museum is an example: although the artifacts it holds should all be in the public domain, it first released digital images for free, unrestricted use only in July 2017, and even then only as low-resolution, watermarked files.

Many GLAM institutions take part in the National Digital Archives Program (NDAP) to obtain government funding (usually from the Ministry of Science and Technology, the Ministry of Culture, or the Ministry of Education), and in the process they encounter the concepts and practice of collection digitization. In Taiwan, however, the results of this program are often criticized on four counts. First, the digitized output usually has to be uploaded to a database built by the funding body, and the user interface of that database is hard to use. Second, the institutions that propose and carry out the projects lack digital literacy, so the digitized data is still designed to be read on a web page and is hard for programs to use. Third, the projects are not coordinated: each one defines its own data fields, so the databases are hard to compare with each other. Fourth, the projects are not sustainable: once a grant expires and no further funding is obtained, the content stops being updated and may no longer be managed at all.

The relationship between the National Digital Archives Program (NDAP) and Wikimedia outreach is ambiguous. Some past Wikipedia writing projects were closely related to projects under the program, so the project leads invited Wikimedia Taiwan to take part when they applied for funding. Some of them passed on a discouraging message: because of its performance indicators, the funding body did not want their activities to add content to Wikipedia rather than to the program's own website (even though the project leads agreed that Wikimedia projects have far greater reach). Other project leads, however, told us that including Wikimedia in the plan was praised as a creative idea in review meetings.

Process

How the GLAM Project Was Run

In the project between Wikimedia Taiwan and the Li Meishu Memorial Gallery, a standard workshop followed these steps:

  1. Participants register a Wikimedia account and sign up on the Programs & Events Dashboard.
  2. Gallery staff present the digital files of the paintings to be uploaded in that session and explain the art-historical background of Li Meishu and of those paintings.
  3. Each participant picks one painting and, following the Wikimedia instructor, uploads its digital file to Wikimedia Commons.
  4. Following the instructor, participants create a Wikidata item for the painting they have just uploaded.
  5. Participants apply what they learned in the previous two steps to upload the remaining paintings and create their items on their own.
  6. Participants fill in a questionnaire, followed by a discussion of possible future uses of Wikidata and a Q&A.

When designing the Wikidata lesson, we first looked up the Wikidata items of several world-famous paintings and finally chose Leonardo da Vinci's "Mona Lisa" (La Gioconda) as our template. Volunteers were asked to enter each painting's dimensions, creator, material, year of creation, and the objects depicted in it.

Some events followed a different plan. The event co-organized with the Department of History at National Taipei University, for example, focused on writing the Wikipedia articles on Li Meishu and his elder brother Liu Qinggang (劉清港). Since that is unrelated to the topic of this report, we do not describe its details here.
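To make the data-entry step concrete, here is a minimal sketch of how the fields volunteers entered could be mapped to Wikidata property IDs before upload. It is an illustration only, not the workflow used in the workshops. The property IDs are quoted from memory and should be verified on Wikidata, the helper names `build_statements` and `missing_fields` are hypothetical, and the sample record is a made-up placeholder rather than data about a real painting.

```python
# Assumed mapping from workshop field names to Wikidata property IDs.
# The IDs are quoted from memory; verify each one on Wikidata before use.
PAINTING_PROPERTIES = {
    "instance of": "P31",
    "creator": "P170",
    "made from material": "P186",
    "inception": "P571",
    "height": "P2048",
    "width": "P2049",
    "depicts": "P180",
}

# Fields the workshop asked every volunteer to fill in.
REQUIRED_FIELDS = ("creator", "made from material", "inception", "height", "width")


def build_statements(record):
    """Translate {field name: value} into {property ID: value}."""
    unknown = sorted(set(record) - set(PAINTING_PROPERTIES))
    if unknown:
        raise ValueError(f"No property mapped for fields: {unknown}")
    return {PAINTING_PROPERTIES[field]: value for field, value in record.items()}


def missing_fields(record):
    """List the required fields that a volunteer has not filled in yet."""
    return [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]


# Placeholder record for illustration only; the values are not real data.
sample = {"creator": "Li Meishu", "made from material": "canvas"}
print(build_statements(sample))
print(missing_fields(sample))
```

A check like `missing_fields` could be run at the end of a session to see which items still need dimensions or a creation year.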

Questions

After the first few workshops we felt the lesson design was a success: most participants completed both the upload and the data entry, and the collection was being uploaded at the expected pace. As the number of workshops grew, however, two questions kept recurring and we realized we had to address them:

  1. How do we correctly enter the values we want to enter? This question has three parts:
    1. Some values should be entered, but no matching Wikidata item can be found. For example, we wanted to add "canvas" as the value of the "material" property, but typing the Chinese word for canvas (畫布) returned no item, so we temporarily used "cloth" (布料) instead.
    2. Some values produce an "invalid" message after they are entered, and we did not know what rules decide this.
    3. Some data has no suitable property. Oil paintings often give their dimensions in "French standard sizes", for which Wikidata had no property. Likewise, temples in Taiwan often enshrine several deities, one of which is designated the "main deity" (主神), and Wikidata had no property for this either.
  2. How will the data be used once it has been created? Wikimedia Taiwan showed the gallery earlier use cases, but these were about searching (for example, finding paintings from a given year) or sorting (for example, putting a chosen set of paintings in a given order). Any GLAM institution that has started building a database, whether its own or the one provided by the National Cultural Memory Bank, already has similar functions. A more compelling vision was missing.

Once we recognized these questions, we did a simple analysis. Both stem from our own insufficient understanding of Wikidata, but they differ. The first can be solved within Wikimedia alone: we can search the documentation and keep imitating other well-populated items to fill in our own, and given enough time the items can be improved step by step. The second involves outside factors: besides knowing Wikidata well, we also have to compare it with the systems GLAM institutions already have access to and understand the problems they want to solve, before we can imagine a use that is both creative and attractive enough. Our first attempt at a solution therefore began with getting to know Wikidata better and understanding our partner's needs.

Solutions

Learning Wikidata's rules in more depth was not difficult. Before the project began, we had already found the properties our paintings could use, and the values they should take, by studying the Wikidata items of other paintings. By studying more artwork items we found more and more usable properties, and our understanding of the data became more accurate. Searching the documentation by keyword was another basic way to learn. That is how we found the page for proposing new properties, which resolved the third part of the first question.

The planned approach to the second question did not go smoothly. GLAM institutions in Taiwan generally find it hard to state their technical needs directly. The gallery staff who worked with us belong to its information team, are young people who grew up in the digital age, and have long been active in the open-source community, yet at first even they found it hard to describe the gallery's technical needs. Their wish list contained fairly vague goals such as "improve art education in Taiwan" or "help more people learn how to look at paintings". In our conversations we did come up with ideas such as "with a database, a computer can automatically sort several paintings by year, which is convenient for teaching", but ideas of this kind can already be realized with the National Cultural Memory Bank promoted by the Taiwanese government, so Wikidata offers no clear advantage. Other ideas involved advanced artificial intelligence and were not feasible with our resources, for example "train a program on the data of 50 paintings so that it can tell in which period a painting was created even when no data has been entered for it".

We need Wikidata to have an advantage over Taiwan's existing databases not because we are partial to Wikimedia, but because without one we cannot persuade government bodies to let the data be built on Wikidata rather than in their own databases. Government databases have two drawbacks: their copyright policies raise concerns that rights in the works will be appropriated, and they often lose their maintenance budget once a project ends. GLAM institutions therefore hope we can propose something with technical potential, which would make it easier for them to persuade grant reviewers to agree to build the data on Wikidata.

In the end, the second question was resolved by accident while we were solving the first. While studying the properties available for "artist" items, we came across "student of" (Property:P1066; the Chinese label is 老師, "teacher"). Gallery staff then entered the documented teachers of several Taiwanese artists into Wikidata, and in doing so they noticed something obvious that had long been overlooked. According to the oral accounts of Li Meishu's children, the photographer Phêⁿ Sūi-lîn (彭瑞麟, Q11069923) often visited the Li household, yet no surviving written record mentions any contact between the two men. Lacking corroboration, many researchers did not fully trust this account. In fact, both he and Li Meishu were students of Kinichiro Ishikawa (石川欽一郎, Q3847009) when they went to study in Japan.

This fact exists in the literature, so why was it overlooked before and yet discovered while we were creating Wikidata items? We compared Wikidata once more with other documents and databases and identified three differences:

  1. Ordinary sources such as biographies and historical records are long prose texts. A fact buried in them rarely attracts attention or prompts associations. In Wikidata, the act of entering data makes us aware when the same value is entered under the same property.
  2. Databases built by Taiwan's cultural agencies or GLAM institutions seldom include a field for a painter's teachers, so digitization work never brought this fact to light. In Wikidata, even though we did not include this property at first, the properties of an item can be added or removed at any time, so we had the chance to notice later that this information belonged there.
  3. Even if a custom-built database did record teachers, the relationship might still be hard to find. Li Meishu was a painter and Phêⁿ Sūi-lîn a photographer, so they would probably appear in separate databases for painters and for photographers, never in the same one. A broader "artists" database might let data-entry staff notice the link, but such a database needs more resources and is unlikely to be built in practice. Because Wikidata can hold all of human knowledge, it gives us the chance to uncover more relationships between data.
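The first difference, spotting the same value under the same property, can be sketched in a few lines. The example below is a hedged illustration over a tiny in-memory list of (item, property, value) triples, not a query against Wikidata itself. P1066, Q11069923, and Q3847009 come from this report; `LI_MEISHU`, `OTHER_ARTIST`, and `OTHER_TEACHER` are placeholder identifiers because their item IDs are not given here, and the function name `shared_values` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# (item, property, value) triples. Placeholder identifiers are used where
# this report does not give a Wikidata item ID.
STATEMENTS = [
    ("LI_MEISHU", "P1066", "Q3847009"),   # Li Meishu, student of Kinichiro Ishikawa
    ("Q11069923", "P1066", "Q3847009"),   # Phêⁿ Sūi-lîn, student of Kinichiro Ishikawa
    ("OTHER_ARTIST", "P1066", "OTHER_TEACHER"),
]


def shared_values(statements, prop):
    """Return (item_a, item_b, value) for every pair of items that share a value under prop."""
    items_by_value = defaultdict(set)
    for item, p, value in statements:
        if p == prop:
            items_by_value[value].add(item)
    links = []
    for value, items in items_by_value.items():
        for a, b in combinations(sorted(items), 2):
            links.append((a, b, value))
    return sorted(links)


print(shared_values(STATEMENTS, "P1066"))
```

Because the grouping is by value rather than by database or subject area, a painter and a photographer end up in the same bucket as soon as both statements exist.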

This discovery made us realize that, besides searching and sorting by values, linking is one of Wikidata's most important functions, and that in two respects it outperforms Taiwan's local databases:

  1. Wikidata places no limit on subject matter, whereas most local databases are restricted to a particular theme. Many links that exist in Wikidata therefore cannot be found in a local database.
  2. Properties can keep being added to any Wikidata item, so the data collected is flexible. In local databases, what each record should contain is mostly fixed at design time; fields nobody thought of are hard to add later, so new links cannot be discovered as the data grows.

This gives us an answer to the second question: the more complete Wikidata becomes, the better a program can search for paths along which two items may be related. In the future we can rely on machines to notice details that people tend to overlook, giving GLAM institutions more research material. At the end of 2018 we therefore drafted a software project to develop a tool that searches for possible connections between any two Wikidata items.
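One way such a connection search could work is a breadth-first search over statements treated as graph edges. The sketch below is only an assumption about the general technique, not the design of the planned tool. It runs on a small in-memory list of triples; `LI_MEISHU` is a placeholder identifier, and `find_path` and `max_hops` are hypothetical names.

```python
from collections import deque

# Sample triples; LI_MEISHU is a placeholder, the other IDs come from this report.
STATEMENTS = [
    ("LI_MEISHU", "P1066", "Q3847009"),
    ("Q11069923", "P1066", "Q3847009"),
]


def find_path(statements, start, goal, max_hops=4):
    """Breadth-first search for the shortest chain of statements linking start to goal.

    Returns a list of (from_item, property, to_item) steps, or None if no
    chain of at most max_hops steps exists.
    """
    neighbours = {}
    for subject, prop, obj in statements:
        # Links are followed in both directions.
        neighbours.setdefault(subject, []).append((prop, obj))
        neighbours.setdefault(obj, []).append((prop, subject))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue
        for prop, nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, prop, nxt)]))
    return None


print(find_path(STATEMENTS, "LI_MEISHU", "Q11069923"))
```

The `max_hops` limit matters in practice: in a graph as large as Wikidata, almost any two items are connected if the chain is allowed to grow long enough.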

After initial testing, however, we ran into a new difficulty: searching for connections between two items also returns noise. For a Taiwanese painter and a French photographer, for example, the crawled relationships might include "works by both were scanned into digital files by the same model of machine". At first glance such noise looks like meaningless coincidence. Will there be so much of it that it distracts researchers? Can it be filtered out? Or does it carry some meaning after all and should not be treated as noise? These questions need further exploration. Either way, this direction lets us paint a more compelling vision when we promote Wikidata.
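As one possible starting point for the noise question, shared links could be ranked rather than removed, so that a researcher sees the rarest coincidences first. The sketch below assumes, purely for illustration, that a value shared by many items is less informative than one shared by few. That heuristic is not a finding of this project, and the property `scanned_with`, the item names, and the function `rank_shared_links` are all made-up placeholders.

```python
from collections import Counter

# Made-up sample data: two artists share a teacher and a scanner model,
# and three unrelated items were digitized with the same scanner model.
STATEMENTS = [
    ("PAINTER", "P1066", "TEACHER_X"),
    ("PHOTOGRAPHER", "P1066", "TEACHER_X"),
    ("PAINTER", "scanned_with", "SCANNER_MODEL_X"),
    ("PHOTOGRAPHER", "scanned_with", "SCANNER_MODEL_X"),
    ("ITEM_1", "scanned_with", "SCANNER_MODEL_X"),
    ("ITEM_2", "scanned_with", "SCANNER_MODEL_X"),
    ("ITEM_3", "scanned_with", "SCANNER_MODEL_X"),
]


def rank_shared_links(statements, a, b):
    """Return the (property, value) pairs shared by a and b, rarest first."""
    unique = set(statements)
    # How many items carry each (property, value) pair across the data set.
    usage = Counter((prop, value) for _, prop, value in unique)
    pairs_a = {(prop, value) for item, prop, value in unique if item == a}
    pairs_b = {(prop, value) for item, prop, value in unique if item == b}
    return sorted(pairs_a & pairs_b, key=lambda pair: (usage[pair], pair))


print(rank_shared_links(STATEMENTS, "PAINTER", "PHOTOGRAPHER"))
```

Ranking keeps the scanner coincidence visible at the bottom of the list instead of deleting it, which leaves the decision about whether it is meaningful to the researcher.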

What We Have Learned

Based on the story above, we learned the following lessons from the 2018 GLAM project.

Another Possible Form of GLAM Cooperation

On Outreach Wiki, model GLAM projects currently fall into two broad categories, "sharing digital collections" and "sharing knowledge". We wonder whether a third category could be added: connecting and discovering knowledge.

Projects that share digital collections help distribute the tangible holdings of GLAM institutions, and projects that share knowledge turn institutions' research resources into open knowledge. Both extend an institution's reach and promote what it already knows. Through this project, however, we found that a GLAM project need not only ask the partner to share; we can also take the initiative and share the vast content of the Wikimedia projects with GLAM institutions. Their researchers may already know how to consult Wikipedia or use files from Commons, but the Wikimedia community can draw on its knowledge of the various Wikimedia projects to show institutions how to use our content to produce new knowledge or collections, and can even try to develop tools to help.

We do not believe Wikimedia Taiwan is the first organization to discover this, but findings of this kind may not have been given much prominence. Perhaps this is because, elsewhere, simply showcasing an institution's existing knowledge is already attractive enough to potential partners. In Taiwan, where we compete with government-funded, purpose-built databases, the finding matters a great deal. If this direction were also highlighted, then other places in a similar situation to Taiwan, where displaying an institution's knowledge on Wikimedia platforms is not enough to interest GLAM institutions, would have another reason with which to attract partners.

The Unique Value of Wikidata

Working with the Li Meishu Memorial Gallery brought us into closer contact with other approaches to collection digitization, and we came to see that Wikidata has a more distinctive value than we had thought. We had always considered Wikidata important to the Wikimedia projects, but mostly because, compared with the other projects, it presents knowledge as data, which matters for data processing. In other words, what mattered was turning knowledge into data, not the platform itself. We now go further: Wikidata's distinctive value is that it can put all knowledge into a single database while leaving great flexibility in what each item records. Database projects rarely go without a defined subject or a fixed set of fields, because collection would then seem boundless. Wikidata manages both, which lets it break through blind spots in human thinking and reveal relationships between pieces of knowledge that had not been noticed. In future Wikidata outreach we will adjust our material to include a comparison between Wikidata and conventional databases, and highlight this distinctive value to win more opportunities for cooperation.