| Abstract: |
The pervasive presence of digital technologies in children's lives has led to a sharp increase in the dangers children face, including cyberbullying, exposure to exploitation and undesirable content, online grooming, addiction, and privacy violations. Purely text- or image-based approaches to online safety have been largely inadequate to address the complexity of these issues. This work reports the development, testing, and empirical validation of a multimodal artificial intelligence (AI) framework to improve the digital wellbeing and online safety of children aged 6–17. We introduce a novel late-fusion multimodal architecture that combines natural language processing (NLP), computer vision, and audio analysis, achieving an overall F1-score of 94.7% on a test dataset of 42.8K annotated instances. Data were collected using a mixed-methods approach: structured parental surveys (n = 412), analyses of behavioral logs, and content identified across multiple platforms. Multimodal fusion is shown to provide a strong improvement over all unimodal baselines in threat prediction across all categories. The study found that 70.0% of parents would accept an AI-based safety tool; however, privacy concerns represented a substantive barrier to adoption. Analysis of relationships also showed a strong negative association between AI-based online risk indicators and children's subjective wellbeing scores. |