METHODOLOGY > LIST OF TOOLS
METHODOLOGY
F.W. Hasluck’s “Christianity and Islam under the Sultans” is a very rich, diverse and challenging text that rather resists classification to “machine readable” formats and categories. Despite Hasluck’s detailed, field-based descriptions, there is often a great level of ambiguity that himself highlights in his discussions. Also, his writing and spelling reflect his historical and intellectual context of the early 20th century and therefore is often not easily translatable to contemporary entities and categories. Some of the main challenges we had to deal with when extracting data from the text that could be mapped or visualized were the following:
- identifying placesNamed locations mentioned by F. W Hasluck in relation to religious sites, groups and practices, monuments or citing other sources. More: Hasluck often refers in the text to sites and place names that no longer exist or exist under a different name. In addition, Hasluck’s spelling of location names often required a correction or conversion to current versions. We used databases of geographic entities (such as geonames.org) that include both contemporary and historic place names to match Hasluck’s description with an actual location. We also collected images for a number of sites mentioned by Hasluck to match textual description.
- specifying dates and eventsAn act of transformation of a site (e.g. creation, conversion, destruction). More: Hasluck often doesn’t provide a date for certain eventsAn act of transformation of a site (e.g. creation, conversion, destruction). More in the history of a site or is rather vague in doing it (e.g. “about 50 years ago”). This makes it difficult to work with the temporal dimension of the text in a systematic way.
- tracing identity: distinguishing between different ethnic and religious groups is often a challenge as Hasluck often uses similar but not identical terms interchangeably (e.g. “Orthodox”- “Greeks”, “Turks – Moslems”). This raised the issue of differentiating between similar ethnic or faith-based groups but also of constructing these groups and entities in the first place. For a more detailed discussion you can read the micro-essay “Visualizing identity in Hasluck’s text”.
In order to deal with these issues we built a basic “data vocabulary” defining the key concepts and entities we were most interested in capturing and documenting.
Here are the main data entities we identified:
- Sites: named locationsActual, geocoded locations of sites or places in latitude/longitude coordinates More listed and categorized by F. W. Hasluck according to his own typology.
- PlacesNamed locations mentioned by F. W Hasluck in relation to religious sites, groups and practices, monuments or citing other sources. More: named location mentioned by F. W Hasluck in relation to religious sites, groups and practicesSecular or religious uses of a site or place (e.g. ritual, pilgrimage) More, monuments or citing other sources.
- LocationsActual, geocoded locations of sites or places in latitude/longitude coordinates More: actual, geocoded location of a site or place documented in latitude/longitude coordinates.
- CommunitiesEthnic groups or faith-based communities performing religious, spiritual or secular practices in, around or in proximity to a site or place. More: ethnic groups or faith-based communitiesEthnic groups or faith-based communities performing religious, spiritual or secular practices in, around or in proximity to a site or place. More performing religious, spiritual or secularTerm associated with secular practices performed by one or different groups in a site or place. More practicesSecular or religious uses of a site or place (e.g. ritual, pilgrimage) More in, around or in proximity to a site or place.
- CoverageSpatio-temporal coverage of a site. E.g. Ottoman Empire, 1400-1850, Asia Minor. More: spatio-temporal coverageSpatio-temporal coverage of a site. E.g. Ottoman Empire, 1400-1850, Asia Minor. More of a site. E.g. Ottoman Empire, 1400-1850, Asia Minor.
- PracticesSecular or religious uses of a site or place (e.g. ritual, pilgrimage) More: secularTerm associated with secular practices performed by one or different groups in a site or place. More or religious uses of a site or place (e.g. ritual, pilgrimage)
- EventsAn act of transformation of a site (e.g. creation, conversion, destruction). More: an act of transformation of a site (e.g. creation, conversion, destruction).
- SacredTerm associated with religious and spiritual practices performed by one or different groups in a site or place. More: term associated with religious and spiritual practicesSecular or religious uses of a site or place (e.g. ritual, pilgrimage) More performed by one or different groups in a site or place.
- SecularTerm associated with secular practices performed by one or different groups in a site or place. More: term associated with secularTerm associated with secular practices performed by one or different groups in a site or place. More practicesSecular or religious uses of a site or place (e.g. ritual, pilgrimage) More performed by one or different groups in a site or place.
We prioritized the “site” as the core entity around which all other concepts and categories would be mapped and documented and we developed a conceptual map of hierarchical categories that would describe a “site” in “Christianity and Islam”. We used this “ontology” of what makes a site to capture, document and semantically group data from the text.
Following the entities and sub-groups defined we processed the text through multiple levels and stages of data clean-up, filtering and structuring:
- merging both volumes of “Christianity and Islam” into a single, plain text file
- organize data by book chapter
- “translate” the data for 259 sites catalogued by Hasluck in Part I (“Tranferences from Christianity to Islam and vice versa”) and Chapter XLII (“Geographical distribution of the Bektashi”) to a database based on our defined groups and entities
- edit and clean-up (e.g. missing parts, spelling errors) original text parts to be included in the database
- geocode, revolve or manually correct the location data for all 259 sites
- identify other place names and types of location included in the text
- prepare “import-ready” versions of the database for mapping and visualization software tools (e.g.proper formats for dates and coordinates).
This process aimed to “translate” the original text to a structured, semantically annotated and geocoded format that could be then digitally read, analyzed and manipulated.
LIST OF TOOLS
Here is a list of software tools and resources used for Visual Hasluck.
Note: a comprehensive list of digital research tools organized by type of use and/or category can be found at the Digital Research Tools (DIRT) directory. For a list of text analysis tools for digital humanities you can check Folgerpedia.
Text analysis
Voyant Tools
Voyant Tools is a web-based reading and analysis environment for digital texts. We used to get an overview of the text and identify, at an initial stage, any patterns or relations.
Antword Profiler is a freeware tool developed by Lawrence Anthony for profiling the vocabulary level and complexity of texts. We used it to produce frequency analysis lists for certain terms in the text.
AntConc is a corpus analysis toolkit for concordancing and text analysis developed by Lawrence Anthony.
Mapping
Unlock text
Unlock Text is a geoparser that can search text hosted on the web in txt or html format for references to locationsActual, geocoded locations of sites or places in latitude/longitude coordinates More and then return results ready for mapping applications. The Unlock Text API provides access to two parsers, the Edinburgh Geoparser from the Edinburgh Language Technology Group and the CLAVIN parser. We used Unlock text to identify locationsActual, geocoded locations of sites or places in latitude/longitude coordinates More in “Christianity and Islam.”
Carto
Carto is a cloud based mapping, analysis and visualization engine that lets users build spatial applications for both mobile and the web.
Visualization
Gephi
Gephi is a powerful software for exploring data through visualization and network analysis.
Tableau Public
Tableau Public is a web service for creating and publishing interactive data visualizations.