METHODOLOGY

F. W. Hasluck’s “Christianity and Islam under the Sultans” is a rich, diverse and challenging text that resists classification into “machine-readable” formats and categories. Despite Hasluck’s detailed, field-based descriptions, there is often a great deal of ambiguity, which he himself highlights in his discussions. His writing and spelling also reflect the historical and intellectual context of the early 20th century and are therefore often not easily translatable to contemporary entities and categories. The main challenges we faced when extracting data from the text that could be mapped or visualized were the following:

  • identifying places: Hasluck often refers to sites and place names that no longer exist or that exist under a different name. In addition, his spelling of location names often had to be corrected or converted to current forms. We used databases of geographic entities (such as geonames.org) that include both contemporary and historic place names to match Hasluck’s descriptions with actual locations. We also collected images for a number of sites mentioned by Hasluck to match against his textual descriptions.
  • specifying dates and events: Hasluck often doesn’t provide a date for certain events in the history of a site, or is rather vague in doing so (e.g. “about 50 years ago”). This makes it difficult to work with the temporal dimension of the text in a systematic way.
  • tracing identity: distinguishing between different ethnic and religious groups is a challenge, as Hasluck often uses similar but not identical terms interchangeably (e.g. “Orthodox” / “Greeks”, “Turks” / “Moslems”). This raised the issue not only of differentiating between similar ethnic or faith-based groups but also of constructing these groups and entities in the first place. For a more detailed discussion, see the micro-essay “Visualizing identity in Hasluck’s text”.
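
The place-matching challenge can be illustrated with a minimal Python sketch. It assumes a tiny, hand-made gazetteer that maps current names to alternate or historic spellings; the entries, the `match_place` helper and the 0.8 similarity cutoff are illustrative assumptions, not the project's actual data or code. A real workflow would query a resource such as geonames.org instead.

```python
from difflib import get_close_matches

# Hypothetical mini-gazetteer: current name -> alternate or historic spellings.
GAZETTEER = {
    "Konya": ["Konia", "Iconium"],
    "Bursa": ["Brusa", "Prusa"],
    "Ankara": ["Angora"],
}


def match_place(name, cutoff=0.8):
    """Return the current place name that best matches `name`, or None."""
    # Map every known spelling (lowercased) back to its current name.
    lookup = {}
    for current, variants in GAZETTEER.items():
        for spelling in [current, *variants]:
            lookup[spelling.lower()] = current
    # Fuzzy matching lets small spelling differences still resolve.
    hits = get_close_matches(name.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None


print(match_place("Konia"))   # a historic spelling listed in the gazetteer
print(match_place("Brussa"))  # a near-miss spelling, resolved by fuzzy matching
```

Names that fall below the cutoff return `None`, which flags them for manual checking rather than forcing a doubtful match.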

In order to deal with these issues we built a basic “data vocabulary” defining the key concepts and entities we were most interested in capturing and documenting.

Here are the main data entities we identified:

  • Sites: named locations listed and categorized by F. W. Hasluck according to his own typology.
  • Places: named locations mentioned by F. W. Hasluck in relation to religious sites, groups and practices, monuments or citing other sources.
  • Locations: actual, geocoded location of a site or place documented in latitude/longitude coordinates.
  • Communities: ethnic groups or faith-based communities performing religious, spiritual or secular practices in, around or in proximity to a site or place.
  • Coverage: spatio-temporal coverage of a site. E.g. Ottoman Empire, 1400-1850, Asia Minor.
  • Practices: secular or religious uses of a site or place (e.g. ritual, pilgrimage).
  • Events: an act of transformation of a site (e.g. creation, conversion, destruction).
  • Sacred: term associated with religious and spiritual practices performed by one or different groups in a site or place.
  • Secular: term associated with secular practices performed by one or different groups in a site or place.
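
One way to picture this vocabulary is as a small set of record types. The Python sketch below is an illustration only: the class names, field names and sample values are assumptions derived from the entity list above, not the schema of the project's actual database.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Location:
    """Geocoded position in latitude/longitude coordinates."""
    latitude: float
    longitude: float


@dataclass
class Event:
    """An act of transformation of a site."""
    kind: str                    # e.g. "creation", "conversion", "destruction"
    date: Optional[str] = None   # left empty when the text gives no date


@dataclass
class Site:
    """Core entity; the other entities are attached to it."""
    name: str
    site_type: str                        # category from Hasluck's typology
    location: Optional[Location] = None   # filled in after geocoding
    communities: List[str] = field(default_factory=list)
    practices: List[str] = field(default_factory=list)
    events: List[Event] = field(default_factory=list)
    coverage: Optional[str] = None        # e.g. "Ottoman Empire, 1400-1850"


# Placeholder record with made-up values, for illustration only.
example = Site(name="Example tekke", site_type="tekke")
example.practices.append("pilgrimage")
example.events.append(Event(kind="conversion"))
```

Keeping dates and locations optional reflects the gaps in the source text: a site can be recorded before its location is resolved or its events are dated.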

We prioritized the “site” as the core entity around which all other concepts and categories would be mapped and documented, and we developed a conceptual map of hierarchical categories describing a “site” in “Christianity and Islam”. We used this “ontology” of what makes a site to capture, document and semantically group data from the text.

Structuring data: Haji Suleiman Baba tekke

Following the entities and sub-groups defined above, we processed the text through multiple stages of data clean-up, filtering and structuring:

  • merge both volumes of “Christianity and Islam” into a single, plain text file
  • organize data by book chapter
  • “translate” the data for 259 sites catalogued by Hasluck in Part I (“Transferences from Christianity to Islam and vice versa”) and Chapter XLII (“Geographical distribution of the Bektashi”) to a database based on our defined groups and entities
  • edit and clean up the original text passages to be included in the database (e.g. missing parts, spelling errors)
  • geocode, resolve or manually correct the location data for all 259 sites
  • identify other place names and types of location included in the text
  • prepare “import-ready” versions of the database for mapping and visualization software tools (e.g. proper formats for dates and coordinates).

This process aimed to “translate” the original text into a structured, semantically annotated and geocoded format that could then be digitally read, analyzed and manipulated.
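
The last step, preparing “import-ready” data, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the input field names (`name`, `lat`, `lon`, `year`), the five-decimal coordinate precision and the choice to write a known year as an ISO 8601 date on 1 January are all hypothetical, and the sample record is made up. The formats actually required depend on the mapping or visualization tool being used.

```python
import csv
import io

FIELDS = ["name", "latitude", "longitude", "date"]


def to_import_row(site):
    """Convert one site record (a dict) to a flat row with normalized fields."""
    year = site.get("year")
    return {
        "name": site["name"].strip(),
        # Decimal degrees with a fixed precision.
        "latitude": f"{float(site['lat']):.5f}",
        "longitude": f"{float(site['lon']):.5f}",
        # ISO 8601 date when a year is known; empty when the text gives none.
        "date": f"{int(year):04d}-01-01" if year is not None else "",
    }


def write_csv(sites):
    """Return CSV text for a list of site records."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for site in sites:
        writer.writerow(to_import_row(site))
    return buffer.getvalue()


# Made-up record, for illustration only.
print(write_csv([{"name": " Example site ", "lat": "38.5", "lon": 27, "year": 1850}]))
```

Writing a bare year as 1 January implies a precision the source does not have, so a separate column recording the original wording of the date may be worth keeping alongside it.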

LIST OF TOOLS

Here is a list of software tools and resources used for Visual Hasluck.
Note: a comprehensive list of digital research tools organized by type of use and/or category can be found at the Digital Research Tools (DIRT) directory. For a list of text analysis tools for digital humanities you can check Folgerpedia.

Text analysis

Voyant Tools
Voyant Tools is a web-based reading and analysis environment for digital texts. We used it to get an overview of the text and to identify, at an initial stage, any patterns or relations.

AntWordProfiler

AntWordProfiler is a freeware tool developed by Laurence Anthony for profiling the vocabulary level and complexity of texts. We used it to produce frequency analysis lists for certain terms in the text.
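
For readers unfamiliar with frequency lists, the idea can be sketched in a few lines of Python. This is not how AntWordProfiler works internally; the `term_frequencies` helper, the tokenization rule and the sample sentence are assumptions made for illustration.

```python
import re
from collections import Counter


def term_frequencies(text, terms):
    """Count how often each term of interest occurs in `text` (case-insensitive)."""
    # Tokenize into runs of letters; this pattern also keeps accented letters.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(tokens)
    return {term: counts[term.lower()] for term in terms}


# Made-up sample sentence, not a quotation from Hasluck.
sample = "The tekke was once a church. The church became a tekke, and the tekke remains."
print(term_frequencies(sample, ["tekke", "church", "mosque"]))
```

The sketch counts exact word forms only, so plurals and spelling variants would need to be listed as separate terms or normalized first.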

AntConc

AntConc is a corpus analysis toolkit for concordancing and text analysis developed by Laurence Anthony.

Mapping

Unlock Text
Unlock Text is a geoparser that can search text hosted on the web in txt or html format for references to locations and then return results ready for mapping applications. The Unlock Text API provides access to two parsers, the Edinburgh Geoparser from the Edinburgh Language Technology Group and the CLAVIN parser. We used Unlock Text to identify locations in “Christianity and Islam.”

Carto
Carto is a cloud-based mapping, analysis and visualization engine that lets users build spatial applications for both mobile and the web.

Visualization

Gephi
Gephi is a software package for exploring data through visualization and network analysis.

Tableau Public
Tableau Public is a web service for creating and publishing interactive data visualizations.