Memecry: Tracing the Repetition-with-Variation of Formulas on 4chan/pol/
| Contributors |
|
|---|---|
| Publication date | 2022 |
| Description |
Datasets underlying the analysis in the paper "Memecry: Tracing the Repetition-with-Variation of Formulas on 4chan/pol/".
This upload includes the following:
- seedwords.csv: A .csv file with the terms we used as a seed list to filter for 4chan/pol/ posts containing vernacular.
- seedword-network_x.gdf/gephi: .gdf and .gephi network files for co-word networks of /pol/ posts, weighted by normalised pointwise mutual information (NPMI). We only included posts that contained one of the aforementioned seed list words.
- twoflow-data_x.xlsx: .xlsx files with data on triplets common to 4chan/pol/. We identified these three-word sequences through the above network files. Examples include "gr8 b8 m8", "orange man bad", and "lurk moar newfag". The Excel data on these triplets includes: (1) the absolute number of /pol/ posts per year mentioning the triplet (within a window of five words); (2) the average NPMI scores between the three triplet words per year; (3) the top co-words per year with an average NPMI higher than 0.18 with two of the three triplet words.
- triplets.csv: A .csv file with the extracted triplets, including their common appearance as memetic phrases and a short explanation.
This data was used for "two-flow graphs" available at oilab.eu/formulas/.
See the paper for a full explanation of the data.
|
| Publisher | Zenodo |
| Organisations |
|
| Document type | Dataset |
| Related publication | Memecry: tracing the repetition-with-variation of formulas on 4chan/pol/ |
| DOI | https://doi.org/10.5281/zenodo.7100864 |
| Other links | https://zenodo.org/record/7100864 |
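
The description refers to NPMI-weighted co-word networks and to an NPMI threshold of 0.18. As a rough illustration of what such a score measures, the sketch below computes NPMI between word pairs from post-level co-occurrence. It is a minimal example under stated assumptions and not the authors' pipeline: the function name `npmi_scores`, the toy posts, whitespace tokenisation, and counting co-occurrence per whole post (rather than within a word window) are all hypothetical choices made here. The 0.18 cut-off is borrowed from the description purely for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def npmi_scores(posts):
    """Return {(word_a, word_b): NPMI}, counting co-occurrence per post.

    Assumption: two words co-occur if they appear in the same post.
    The dataset itself may use a different window or tokenisation.
    """
    n = len(posts)
    word_counts = Counter()
    pair_counts = Counter()
    for post in posts:
        # Unique, sorted tokens so each pair has one canonical key per post.
        tokens = sorted(set(post.lower().split()))
        word_counts.update(tokens)
        pair_counts.update(combinations(tokens, 2))

    scores = {}
    for (a, b), n_ab in pair_counts.items():
        p_ab = n_ab / n
        if p_ab == 1.0:
            # Pair occurs in every post: -log(1) = 0, so use the limit value 1.
            scores[(a, b)] = 1.0
            continue
        p_a = word_counts[a] / n
        p_b = word_counts[b] / n
        pmi = math.log(p_ab / (p_a * p_b))
        scores[(a, b)] = pmi / -math.log(p_ab)
    return scores


# Toy posts (hypothetical, not taken from the dataset).
posts = [
    "orange man bad",
    "orange man bad lol",
    "man walks dog",
    "bad weather today",
]
scores = npmi_scores(posts)

# Keep only pairs above the illustrative 0.18 threshold.
strong_pairs = {pair: s for pair, s in scores.items() if s > 0.18}
```

NPMI ranges from -1 (the words never co-occur) to 1 (they only ever occur together), so a fixed threshold such as 0.18 is comparable across word pairs of different frequency, which is one reason it suits network edge weights.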