- Title
- Multiple kernel fusion for event detection from multimedia data in Twitter
- Creator
- Alqhtani, Samar
- Relation
- University of Newcastle Research Higher Degree Thesis
- Resource Type
- thesis
- Date
- 2017
- Description
- Research Doctorate - Doctor of Philosophy (PhD)
- Description
- Social media is an active space for sharing text and image content about all kinds of events. Twitter in particular is a platform for instantly sharing information about current events, both planned and unplanned, and it provides a large, dynamic stream of data on emerging events such as natural disasters and emergencies. Much of the previous work on analysing Twitter feeds for natural disasters has focused on the text of tweets. This thesis proposes a system for detecting ‘hot’ events, specifically disasters such as earthquakes, bushfires, and wildfires, that uses visual as well as textual information to improve detection performance. The system starts by monitoring a Twitter stream to pick up tweets containing both text and images, and stores them in a database. The Twitter data is then pre-processed to eliminate unwanted data and transform unstructured data into structured data. In the feature extraction and representation step, features are extracted from both text and images so that a mining tool can be applied for event detection. Text features are extracted using the bag-of-words (BoW) method, weighted by term frequency–inverse document frequency (TF–IDF). For images, the extracted features are histogram of oriented gradients (HOG) descriptors for object detection, the grey-level co-occurrence matrix (GLCM) for texture description, colour histograms, and the scale-invariant feature transform (SIFT); the visual feature extraction method varies depending on the data. Each image is represented as a “bag of visual words”. The text and image features are then input to multiple kernel learning (MKL) for fusion. MKL automatically combines both feature types in order to achieve the best performance.
The proposed system (comprising data collection from Twitter, data pre-processing, feature extraction and representation, data fusion, and event detection) is tested on four datasets from four events: the Napa earthquake 2014, the Washington wildfires 2015, the California wildfires 2015, and the Illapel earthquake 2015. The method is evaluated through two comparisons. The first compares the proposed fusion against text only and images only. The second compares it against three other types of fusion: concatenation fusion, Dempster–Shafer fusion, and kernel-based fusion with sub-gradient descent (SD) optimization. On the Napa earthquake data, the proposed method achieved the best performance in the first comparison, with a fusion accuracy of 0.95, compared to 0.89 with text only and 0.87 with images only. In the second comparison, the alternative fusion methods achieved accuracies of 0.92 for concatenation fusion, 0.91 for Dempster–Shafer fusion, and 0.91 for kernel-based fusion with SD optimization. These results demonstrate that event detection from multimedia data in Twitter is improved by combining multiple features for both images and text. The proposed MKL fusion method improves computational efficiency when handling large volumes of data, and gives better performance than the other fusion approaches. This study has assembled systems, infrastructure, and datasets for Twitter data research, resulting in an accurate and effective method for detecting events such as disasters, which can be used for spreading awareness and organizing responses. The research presents a breakthrough in risk management strategies, one that can improve public health preparedness or lead to better disaster management actions.
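The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the toy tweets, the three-bin visual-word histograms, the function names, the hand-set `text_weight`, and the nearest-class-mean decision rule are all assumptions made for illustration. In particular, the kernel weights here are fixed by hand, whereas MKL would learn them from the training data.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per document (whitespace tokens, smoothed IDF)."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_kernel(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def intersection_kernel(h1, h2):
    """Histogram intersection kernel for L1-normalised histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def fused_kernel(text_a, hist_a, text_b, hist_b, text_weight=0.5):
    """Weighted sum of a text kernel and an image kernel (weight is hand-set)."""
    return (text_weight * cosine_kernel(text_a, text_b)
            + (1.0 - text_weight) * intersection_kernel(hist_a, hist_b))


# Hypothetical toy data: (tweet text, bag-of-visual-words histogram, label),
# where label 1 means "disaster event" and 0 means "other".
tweets = [
    ("strong earthquake damage in napa", [0.75, 0.25, 0.0], 1),
    ("wildfire spreading near homes", [0.5, 0.5, 0.0], 1),
    ("great coffee this morning", [0.0, 0.25, 0.75], 0),
    ("new phone arrived today", [0.0, 0.5, 0.5], 0),
]
query_text, query_hist = "earthquake damage downtown", [0.75, 0.25, 0.0]

text_vecs = tfidf_vectors([text for text, _, _ in tweets] + [query_text])
hists = [hist for _, hist, _ in tweets] + [query_hist]
labels = [label for _, _, label in tweets]

n = len(hists)
kernel_matrix = [
    [fused_kernel(text_vecs[i], hists[i], text_vecs[j], hists[j])
     for j in range(n)]
    for i in range(n)
]


def mean_similarity(row, label):
    """Average fused similarity between the query and training tweets of one class."""
    values = [row[i] for i, y in enumerate(labels) if y == label]
    return sum(values) / len(values)


# The query is the last row; assign it to the class it is most similar to on average.
query_row = kernel_matrix[-1]
predicted = 1 if mean_similarity(query_row, 1) > mean_similarity(query_row, 0) else 0
print(predicted)  # prints 1
```

A weighted sum of valid kernels is itself a valid kernel, so in practice the fused matrix could be passed to any kernel classifier; the class-mean rule above is used only to keep the sketch dependency-free.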
- Subject
- event detection; multimedia data; data mining; machine learning; data fusion; Twitter
- Identifier
- http://hdl.handle.net/1959.13/1354671
- Identifier
- uon:31322
- Rights
- Copyright 2017 Samar Alqhtani
- Language
- eng
- Full Text
File | Description | Size | Format
---|---|---|---
ATTACHMENT01 | Thesis | 1 MB | Adobe Acrobat PDF
ATTACHMENT02 | Abstract | 99 KB | Adobe Acrobat PDF