Hybrid system of multi-modal signal acquisition and processing in the analysis of sigmatism in children
Multi-modal Polish child speech dataset

The dataset is publicly available in accordance with the principle of "as open as possible, as closed as necessary" (a Polish National Science Centre principle). To access the data, submit a written request to the Principal Investigator. The request should contain the justification for access and the applicant's details. The Project Manager reserves the right to refuse access in justified cases, such as the need to protect the privacy of study participants, data confidentiality, or intellectual property rights.

Data on articulation, acoustics, and visual appearance of the articulators

The dataset includes normal and distorted child speech, with a focus on sigmatism. We collected the data in six kindergarten and school facilities in Poland during the speech therapy examinations of 201 children aged 4-8. The material includes 15-channel spatial audio signals and a dual-camera stereovision stream of the speaker's oral region, as well as speech therapy diagnoses. The data record comprises audiovisual recordings of 51 words and 17 logatomes containing all 12 Polish sibilants, together with the corresponding speech therapy diagnoses from two independent speech therapy experts. In total, we gathered 66 781 audio-video segments, including 12 830 words and 53 951 phonemes (12 576 of them sibilants).

Data organization

The structure of the database (main folder) is shown in the picture below. Each participant has a separate folder with the audio, video, and speech diagnosis data. Speaker folders are named 00XXX, where XXX stands for the anonymized three-digit ID of a participant. The database also includes five CSV files with dataset specifications and a PDF file presenting the diagnosis dictionary.
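The 00XXX naming convention makes it straightforward to enumerate participants programmatically. The snippet below is a minimal sketch, not part of the dataset tooling: the function name `list_speaker_folders` and the `root` argument are hypothetical, and it assumes only what is stated above, namely that speaker folders sit directly in the main folder and are named with two zeros followed by a three-digit ID.

```python
import re
from pathlib import Path

# Assumption taken from the description above: speaker folders are named
# 00XXX, i.e. two zeros followed by a three-digit anonymized ID.
SPEAKER_DIR = re.compile(r"^00(\d{3})$")


def list_speaker_folders(root):
    """Return {participant_id: folder_path} for each 00XXX folder in root."""
    speakers = {}
    for entry in sorted(Path(root).iterdir()):
        match = SPEAKER_DIR.match(entry.name)
        # Skip the CSV/PDF files and any folder that does not fit the pattern.
        if match and entry.is_dir():
            speakers[match.group(1)] = entry
    return speakers
```

Calling `list_speaker_folders("path/to/main_folder")` returns a dictionary keyed by the three-digit ID, which can then be matched against the summary files.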

For more details regarding the participants and the language material, you can access the two summaries below. The CSV file named participantSummary gathers the anonymized data of the children examined (including age, sex, folder and file structure, etc.), while the file segmentSummary describes all audio-visual segments available in the dataset (including words, logatomes, and phonemes, with a quality validation for each file).

participantSummary.csv
segmentSummary.csv
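Once downloaded, either summary can be inspected with the Python standard library. The sketch below is illustrative only: the helper name `read_summary` is hypothetical, and the comma delimiter and UTF-8 encoding are assumptions, since this page does not document the file format or the column names. The function therefore returns whatever header the file contains rather than relying on specific columns.

```python
import csv


def read_summary(path, delimiter=","):
    """Read a summary CSV and return (column_names, rows as dictionaries).

    The delimiter and encoding are assumptions; adjust them if the
    downloaded file uses, for example, semicolons.
    """
    # "utf-8-sig" reads plain UTF-8 and also strips a byte-order mark if present.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        rows = list(reader)
        return list(reader.fieldnames or []), rows
```

For example, `columns, rows = read_summary("participantSummary.csv")` lets you print `columns` first to see which fields are actually available before filtering `rows`.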
Contact us for further information

dr hab. inż. Paweł Badura, prof. PŚ

Principal Investigator

dr inż. Zuzanna Miodońska

Head of Acoustic and Visual Research

dr Joanna Trzaskalik

Head of Speech-Language Pathology Research

dr inż. Michał Kręcichwost

Chief Biomedical Engineer

mgr inż. Agata Sage

Biomedical Engineer

mgr Ewa Kwaśniok

Speech-Language Therapist

© Silesian University of Technology

General information clause on the processing of personal data by the Silesian University of Technology

The authors, that is, the organizational units in which the information materials were produced, are fully responsible for their correctness, currency, and compliance with the law. Hosted by: IT Center of the Silesian University of Technology

Data availability statement

"E-Politechnika Śląska - creation of a platform of electronic public services of the Silesian University of Technology"

Fundusze Europejskie