Allegro Reviews

Allegro Reviews is a sentiment analysis dataset, consisting of 11,588 product reviews written in Polish and extracted from Allegro.pl - a popular e-commerce marketplace. Each review contains at least 50 words and has a rating on a scale from one (negative review) to five (positive review).

We recommend using the provided train/dev/test split. The ratings for the test set reviews are kept hidden. You can evaluate your model using the online evaluation tool available on klejbenchmark.com.

The dataset can be downloaded from here.

Evaluation

To counter the slight class imbalance in the dataset, we propose evaluating models with wMAE, i.e. the macro-average of the mean absolute error per class. Additionally, we rescale the ratings to lie between zero and one and report 1 − wMAE as the final score.
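Written out, the metric described above can be read as follows. This is a sketch of the definition rather than an official formula: the symbols are introduced here for illustration, with $C$ the set of rating classes that occur among the true labels, $I_c$ the indices of reviews whose true rating is $c$, and $\tilde{y} = (y - 1)/4$ the rescaled rating.

```latex
\mathrm{wMAE} = \frac{1}{|C|} \sum_{c \in C} \frac{1}{|I_c|} \sum_{i \in I_c} \left| \tilde{y}_i - \tilde{\hat{y}}_i \right|,
\qquad
\mathrm{AR\ score} = 1 - \mathrm{wMAE}
```

Each class contributes equally to the average regardless of how many reviews it contains, which is what offsets the class imbalance.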

Python implementation of the proposed metric:

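import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error


def ar_score(y_true, y_pred):
    # Rescale ratings from [1, 5] to [0, 1]; np.asarray also accepts plain lists.
    ds = pd.DataFrame({
        'y_true': (np.asarray(y_true, dtype=float) - 1.0) / 4.0,
        'y_pred': (np.asarray(y_pred, dtype=float) - 1.0) / 4.0,
    })
    # wMAE: MAE within each true-rating class, then the unweighted mean over classes.
    wmae = ds \
        .groupby('y_true') \
        .apply(lambda df: mean_absolute_error(df['y_true'], df['y_pred'])) \
        .mean()

    return 1 - wmae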
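To illustrate why the per-class averaging matters, the following sketch compares 1 − wMAE with a plain 1 − MAE on a tiny, made-up set of labels. The function name `ar_score_sketch` is hypothetical, and the body is a reformulation that avoids scikit-learn; it is assumed to give the same values as the implementation above, not taken from the benchmark code.

```python
import numpy as np
import pandas as pd


def ar_score_sketch(y_true, y_pred):
    """Illustrative 1 - wMAE on ratings rescaled from [1, 5] to [0, 1]."""
    ds = pd.DataFrame({
        "y_true": (np.asarray(y_true, dtype=float) - 1.0) / 4.0,
        "y_pred": (np.asarray(y_pred, dtype=float) - 1.0) / 4.0,
    })
    ds["abs_err"] = (ds["y_true"] - ds["y_pred"]).abs()
    # Mean absolute error within each true-rating class,
    # then the unweighted mean over the classes.
    wmae = ds.groupby("y_true")["abs_err"].mean().mean()
    return 1.0 - wmae


# Made-up labels: four 1-star reviews and one 5-star review,
# scored against a model that always predicts 1 star.
y_true = [1, 1, 1, 1, 5]
y_pred = [1, 1, 1, 1, 1]

score = ar_score_sketch(y_true, y_pred)
plain = 1.0 - np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))) / 4.0
print(f"1 - wMAE = {score:.2f}, 1 - MAE = {plain:.2f}")
# prints "1 - wMAE = 0.50, 1 - MAE = 0.80"
```

The constant predictor is perfect on the majority class and maximally wrong on the minority class, so the macro-averaged score drops to 0.50 while the plain average would still report 0.80.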

Results

Model	AR Score
ELMo	86.15
Multilingual BERT	83.33
Slavic BERT	84.31
XLM-17	84.52
HerBERT	84.48

License

CC BY-SA 4.0

Citation

If you use this dataset, please cite the following paper:

@inproceedings{rybak-etal-2020-klej,
    title = "{KLEJ}: Comprehensive Benchmark for Polish Language Understanding",
    author = "Rybak, Piotr and Mroczkowski, Robert and Tracz, Janusz and Gawlik, Ireneusz",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.111",
    pages = "1191--1201",
}

Authors

The dataset was created by the Allegro Machine Learning Research team.

You can contact us at: klejbenchmark@allegro.pl
