Forensic authorship classification by paragraph vectors of speech transcriptions
In forensic comparison, document classification techniques are used mainly for authorship classification and author profiling. In the present study, we aim to introduce paragraph vector modelling (by Doc2Vec) into the likelihood-ratio framework of forensic evidence comparison. Transcriptions...
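| Authors: | Sztahó Dávid, Beke András, Szaszák György, Fejes Attila |
|---|---|
| Corporate author: | Magyar számítógépes nyelvészeti konferencia (18.) (2022) (Szeged) |
| Document type: | Book part |
| Published: | 2022 |
| Series: | Magyar Számítógépes Nyelvészeti Konferencia 18 |
| Keywords: | Linguistics - computer applications |
| Subjects: | Natural sciences; Computer and information science; Humanities; Languages and literature |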
| Online Access: | http://acta.bibl.u-szeged.hu/75880 |
| LEADER | 01973naa a2200289 i 4500 | ||
|---|---|---|---|
| 001 | acta75880 | ||
| 005 | 20221108114908.0 | ||
| 008 | 220525s2022 hu o 1|| eng d | ||
| 020 | |a 978-963-306-848-9 | ||
| 040 | |a SZTE Egyetemi Kiadványok Repozitórium |b hun | ||
| 041 | |a eng | ||
| 100 | 1 | |a Sztahó Dávid | |
| 245 | 1 | 0 | |a Forensic authorship classification by paragraph vectors of speech transcriptions |h [electronic document] / |c Sztahó Dávid |
| 260 | |c 2022 | ||
| 300 | |a 271-279 | ||
| 490 | 0 | |a Magyar Számítógépes Nyelvészeti Konferencia |v 18 | |
| 520 | 3 | |a In forensic comparison, document classification techniques are used mainly for authorship classification and author profiling. In the present study, we aim to introduce paragraph vector modelling (by Doc2Vec) into the likelihood-ratio framework of forensic evidence comparison. Transcriptions of spontaneous speech recordings are used to train the paragraph vector extraction model. Logistic regression models are trained on the cosine distances of paragraph vector pairs to predict the probability that a pair originates from the same author or from different authors. Results are evaluated according to different speaking styles (transcriptions of the speech tasks available in the dataset). Cllr (log-likelihood-ratio cost) and equal error rate values (the lowest are 0.47 and 0.11, respectively) show that the method can be useful as a feature for forensic authorship comparison and may extend the voice comparison methods used for speaker verification. | |
| 650 | 4 | |a Natural sciences | |
| 650 | 4 | |a Computer and information science | |
| 650 | 4 | |a Humanities | |
| 650 | 4 | |a Languages and literature | |
| 695 | |a Linguistics - computer applications | ||
| 700 | 0 | 1 | |a Beke András |e aut |
| 700 | 0 | 1 | |a Szaszák György |e aut |
| 700 | 0 | 1 | |a Fejes Attila |e aut |
| 710 | |a Magyar számítógépes nyelvészeti konferencia (18.) (2022) (Szeged) | ||
| 856 | 4 | 0 | |u http://acta.bibl.u-szeged.hu/75880/1/msznykonf_018_271-279.pdf |z Document access |
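
The abstract in field 520 describes a pipeline of cosine distances between paragraph vector pairs, a logistic regression score, and evaluation by Cllr. The following is a minimal sketch of those three steps in plain Python, not the authors' implementation. The function names, the `intercept` and `slope` parameters, and the assumption that the logistic regression was trained on balanced same-author and different-author pairs (so that its odds can be read as a likelihood ratio) are illustrative and do not come from the record.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two paragraph vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def likelihood_ratio(distance, intercept, slope):
    """Turn a distance into a likelihood ratio with a fitted logistic model.

    Assumption: the model was trained on balanced classes, so the odds
    exp(intercept + slope * distance) equal the likelihood ratio.
    `intercept` and `slope` are hypothetical fitted coefficients.
    """
    return math.exp(intercept + slope * distance)


def cllr(same_author_lrs, different_author_lrs):
    """Log-likelihood-ratio cost over same-author and different-author trials."""
    same_cost = sum(math.log2(1.0 + 1.0 / lr) for lr in same_author_lrs)
    diff_cost = sum(math.log2(1.0 + lr) for lr in different_author_lrs)
    return 0.5 * (same_cost / len(same_author_lrs)
                  + diff_cost / len(different_author_lrs))
```

A system that always outputs a likelihood ratio of 1 carries no information and yields a Cllr of 1; lower values are better, which gives context for the lowest Cllr of 0.47 reported in the abstract.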