The Min and Hakka Language Archives Project is one of the five sub-projects of the Language Archives Project (Phase Two) at the Institute of Linguistics, Academia Sinica. It aims to expand the Min language corpora and to establish a Hakka language corpus by archiving long texts, dictionaries, and spoken data.

In Phase One of the Language Archives Project, the sub-project "Southern Min Archives: A Database of Historical Change and Language Distribution" compiled a tagged Southern Min text corpus and a linguistic GIS database with maps of language distribution, from the perspectives of historical linguistics and language variation respectively. Compilation of the text corpus began with the digitization, tagging, collation, and annotation of the earliest Southern Min literary work known to date, the 1566 version of "Lijing Ji" (The tale of the Lichee and the Mirror). It then extended to three later versions (1581, 1651, and 1884); to similar drama scripts, namely "Tongchuang Qin Shu Ji" (the story of Liang Shanbo and Zhu Yingtai), "Jinhua Nü" (A Girl named Jinhua), and "Su Liuniang" (The tale of Su Liuniang); and to "koa-a", booklets of Southern Min folk songs written in Chinese characters. Besides Southern Min texts, the corpus also includes a few Hakka folk songs, such as "Du Tai Bei Ge" (A Tragic Ballad about Hakka Sailing to Taiwan).

The "Min and Hakka Language Archives" project, a follow-up to "Southern Min Archives: A Database of Historical Change and Language Distribution", further expands the scope of the Southern Min Archives by systematically archiving Min and Hakka texts written in Chinese characters, in Romanized scripts, and in the hybrid Chinese-Roman script. These texts will be tagged and made available through a search interface, and online dictionaries will be compiled. In addition, GIS techniques will be used to investigate language distribution in Min-Hakka bilingual speech communities, where Southern Min and Hakka speakers live together, and to examine how the Min and Hakka languages interact and influence each other. Such communities include the rural areas of Lun-Bei and Er-Lun in Yun-Lin County, Hsinpu Township in Hsinchu County, and Houlong and Nanjhuang in Miaoli County.

For the Min Archive, several significant works will be incorporated: Doctrina Christiana en letra y lengua china (published in Manila around 1605); earlier dictionaries such as Chinese-English Dictionary of the Vernacular or Spoken Language of Amoy (Douglas 1873), English and Chinese Dictionary of the Amoy Dialects (Macgowan 1883), A Dictionary of the Amoy Vernacular (Campbell 1913), and Taiwanese-Japanese Dictionary (Ogawa 1931, 1932); earlier medical books in the Romanized script "Peh-oe-ji"; Taiwan Church News (1885-1969); and contemporary texts and spoken data. On the Hakka side, earlier Hakka language documents such as a Hakka-English dictionary (1926), a Hakka-French dictionary (1926), and missionary works, as well as folk tales gathered from Taichung County and Taoyuan County, will be archived to build a Hakka corpus with interfaces for search and browsing. The aim is to compile research-oriented corpora of Hakka phonology, lexicon, texts, and GIS information. The Archive may also contribute to Hakka language teaching and promotion.