Using the Database
The “Data” tab displays all of the data collected for this project.
By default, all languages are shown and all forms (statives, inchoatives, causatives) are populated where available.
To the left of the main output chart, the “Rules” section lets you filter the data by specific relationships. Use the drop-down menus to select the desired relationship (for example, we have highlighted “Causative is derived from inchoative”) and click “Add rule” to apply the choice. Multiple rules may be added, and both the data and the output chart to the right update in real time as rules are applied. To remove a rule, click the x in its dark gray box.
The “Filters” section has options for selecting particular languages, root classes (broad and/or narrow), and specific roots. These settings are useful if you want to isolate all cooking verbs or all data for Burmese, for example. You may type directly in the boxes to narrow down the options, or you may click in each box to browse the full list. Multiple options may be selected at once, and the data will update in real time. If no data appears, no entries match the selected combination of filters.
When rules and filters are applied to the data, the website URL updates to reflect them, so each search has its own unique address. You can therefore copy and paste the site address and still retain all of your search options.
A section of the data output table is shown below.
The first two columns identify the pair of forms that a row compares. Read each row as “the First form stands in relationship X to the Second form.” All of the possible relationships are shown in the top row’s colored blocks (unrelated, equipollent, transitive, etc.) and are briefly defined below. See our Data Collection Guide under “Project Details” for more detailed information.
- Unrelated: these forms are not directly related in any way.
- Equipollent: the two forms are both derived directly from the same (other) form.
- Transitive: there is a way to get to the second form from the first, but it goes through other forms along the way.
- Labile: these two forms surface in the same lexical form.
- Input: the first form is used to derive the second.
- Derived: the first form is derived from the second form (thus, the second form is input to the first).
This chart, for example, shows that 2.7% of underlying forms are labile with the simple state forms. Further, 84.9% of these underlying forms are used to form the inchoatives (they are inputs to the inchoative forms).
Under each form’s name (in the first two columns), there is a percentage showing how much data exists in the entire dataset for that form. In this example, of the possible 8905 underlying forms (across all languages for all roots we searched for), 84.5% are blank (7526 possible forms were not found in the literature or do not exist). This is perhaps not surprising.
The Collocations column shows how many instances of each “pair” are attested. This gives you an idea of the sample size behind the bar graph. In the above example, there are 8905 possible forms, but both an underlying root and a simple state are attested (that is, both forms are given for a particular paradigm) in only 627 cases, or 7.0%.
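The percentages quoted in this example can be reproduced with a little arithmetic. The short Python sketch below is only an illustration using the counts given above; the function name `share` is hypothetical and is not part of the website.

```python
def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)

total_forms = 8905        # possible underlying forms (from the example above)
blank_underlying = 7526   # underlying forms that are blank
attested_pairs = 627      # paradigms with both an underlying root and a simple state

print(share(blank_underlying, total_forms))                # 84.5
print(share(total_forms - blank_underlying, total_forms))  # 15.5
print(share(attested_pairs, total_forms))                  # 7.0
```

The second figure is simply the complement of the first: if 84.5% of the possible underlying forms are blank, the remaining 15.5% are attested.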
All data populates below the visualization, as exemplified below.
You can opt to hide/show the forms that we created (“hypotheticals”) based on references like grammars. These are all marked with the @ symbol, as highlighted above. The references can also be toggled on and off.
All columns containing the forms we collected (the underlying form, simple state, etc.) can be hidden by hovering over the name of the column and clicking the blue x that appears in the corner.
Each row (the paradigm for a particular root for a given language) is also accompanied by a diagram describing the relationships between the forms. The diagram key is as follows:
- Unrelated: the circles are simply not connected by any path.
- Equipollent: the circles are connected by a green, double-headed arrow.
- Transitive: there is no overt marking for this relationship; it follows from the input/derived relations.
- Labile: the circles are connected with an orange double line.
- Input/derived: the circles are connected by a blue, single-headed arrow pointing in the direction of derivation.
- Where data is unattested, the circle is faded out.
In the above example, you can see that the underlying form and the simple state are labile. The underlying form is input to both the causative and the result state (as shown by the blue arrows). The double-headed green arrows show that the causative and result state forms are, coincidentally, equipollent. They are both derived from the same form, the underlying root.
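For readers who like to think in code, the relationship definitions can be sketched as a small graph problem. The Python below is only a rough illustration, not the code behind this website: the function names, the way derivations and labile pairs are represented, and the order in which the relationships are checked are all assumptions. In particular, this sketch follows only derivation arrows when looking for transitive paths.

```python
from collections import deque

def reachable(start, goal, derives):
    """Breadth-first search along derivation arrows (source -> target)."""
    queue, seen = deque([start]), {start}
    while queue:
        node = queue.popleft()
        for source, target in derives:
            if source == node and target not in seen:
                if target == goal:
                    return True
                seen.add(target)
                queue.append(target)
    return False

def classify(first, second, derives, labile=frozenset()):
    """Name the relationship of `first` to `second` (assumed precedence order)."""
    if frozenset((first, second)) in labile:
        return "labile"          # same lexical form
    if (first, second) in derives:
        return "input"           # first is used to derive second
    if (second, first) in derives:
        return "derived"         # first is derived from second
    parents_first = {s for s, t in derives if t == first}
    parents_second = {s for s, t in derives if t == second}
    if parents_first & parents_second:
        return "equipollent"     # both derived directly from the same form
    if reachable(first, second, derives):
        return "transitive"      # a path exists, but only through other forms
    return "unrelated"

# The paradigm described in the example above.
derives = {("underlying", "causative"), ("underlying", "result state")}
labile = {frozenset(("underlying", "simple state"))}

print(classify("underlying", "simple state", derives, labile))  # labile
print(classify("underlying", "causative", derives, labile))     # input
print(classify("causative", "underlying", derives, labile))     # derived
print(classify("causative", "result state", derives, labile))   # equipollent
```

Because direct arrows are checked before the path search runs, any path the search finds must pass through at least one other form, which matches the definition of a transitive relationship given earlier.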
The data can be exported as a .csv (comma-separated values) file to be analyzed or modified to your heart’s content. Click the blue “Download” button to export the current data table. Note that the export reflects any filters or rules currently applied: if your search limits the data shown, it also limits the data you download, so you will not be exporting the contents of the entire database.
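Once downloaded, the file can be opened in any spreadsheet program or read with a script. As a minimal sketch, the Python below blanks out hypothetical forms (those marked with the @ symbol) after loading. The column names and the rows in `SAMPLE` are placeholders invented for illustration, not the layout or contents of the real export, and the function name `blank_hypotheticals` is likewise hypothetical.

```python
import csv
import io

# Placeholder data standing in for a downloaded file; the real export's
# column names and contents may differ.
SAMPLE = """language,root,underlying,causative
LangA,root1,form1,@form2
LangB,root2,form3,form4
"""

def blank_hypotheticals(rows, marker="@"):
    """Replace any cell containing the hypothetical marker with an empty string."""
    return [
        {column: ("" if marker in (value or "") else value)
         for column, value in row.items()}
        for row in rows
    ]

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
cleaned = blank_hypotheticals(rows)
print(cleaned[0]["causative"])  # prints an empty line: the @ form was removed
print(cleaned[1]["causative"])  # form4
```

To work with an actual download, replace `io.StringIO(SAMPLE)` with a file opened via `open(path, newline="", encoding="utf-8")`, where `path` points to the saved .csv file.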
Other Pages
See the “Project Details” tab to explore the data collection guidelines and other project-related information.
See the “Languages” tab to look at all of the languages we investigated, sorted by geographic region and linked to external sites describing each.
See the “Members” tab to read about the team that made this website possible.
See the “Publications” tab to view current publications (books, journal articles, etc.) and talks based on this project, as well as the foundational work that contributed to its birth.
See the “Bibliography” tab to view and download the complete list of sources that were used to compile the data.