Research Papers

Multimodal movie genre classification using recurrent neural network

Published in Multimedia Tools and Applications, 2022 [Link] [Code] [PDF]

Authors

Tina Behrouzi, Ramin Toosi, Mohammad Ali Akhaee

Abstract

Genre is one of the features of a movie that defines its structure and type of audience. The number of streaming companies interested in automatically deriving movies’ genres is rapidly increasing. Genre categorization of trailers is a challenging problem because of the conceptual nature of the genre, which is not presented physically within a frame and can only be perceived from the whole trailer. Moreover, several genres may appear in the movie at the same time. Multi-label learning algorithms have not improved as significantly as single-label classification models, which makes the genre categorization problem highly complicated. In this paper, we propose a novel multi-modal deep recurrent model for movie genre classification. A new structure based on the Gated Recurrent Unit (GRU) is designed to derive spatial-temporal features of movie frames. The video features are then concatenated with the audio features to predict the final genres of the movie. The proposed design outperforms the state-of-the-art models in accuracy and computational cost and substantially improves the movie genre classifier system’s performance.
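The fusion and multi-label steps described in this abstract can be illustrated with a minimal sketch. It assumes the video and audio features are already extracted, and it uses a single linear layer with one independent sigmoid per genre; the function name `predict_genres`, the weights, the genre list, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
from math import exp


def sigmoid(z):
    """Logistic function mapping a score to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def predict_genres(video_feat, audio_feat, weights, biases, genres, threshold=0.5):
    """Concatenate both modalities and score every genre independently.

    weights[g] is a hypothetical weight vector for genre g over the fused
    feature vector. Because each genre gets its own sigmoid, several genres
    can be predicted for the same movie (multi-label output).
    """
    fused = list(video_feat) + list(audio_feat)  # feature concatenation
    predicted = []
    for genre, w, b in zip(genres, weights, biases):
        score = sum(wi * xi for wi, xi in zip(w, fused)) + b
        if sigmoid(score) >= threshold:
            predicted.append(genre)
    return predicted


genres = ["action", "comedy", "drama"]
weights = [[2.0, 0.0, 0.0], [0.0, 0.0, -4.0], [1.0, 0.0, 2.0]]
biases = [0.0, 0.0, -1.0]
labels = predict_genres([1.0, 0.0], [0.5], weights, biases, genres)
```

Independent sigmoids, rather than a softmax, are a common choice for multi-label problems because they let more than one genre pass the threshold.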

An automatic spike sorting algorithm based on adaptive spike detection and a mixture of skew-t distributions

Published in Scientific Reports, 2021 [Link] [Code] [PDF]

Authors

Ramin Toosi, Mohammad Ali Akhaee, Mohammad-Reza A Dehaqani

Abstract

Developing high-density electrodes for recording large ensembles of neurons provides a unique opportunity for understanding the mechanism of the neuronal circuits. Nevertheless, the change of brain tissue around chronically implanted neural electrodes usually causes spike wave-shape distortion and raises the crucial issue of spike sorting with an unstable structure. Automatic spike sorting algorithms have been developed to extract spikes from these big extracellular data. However, due to the spike wave-shape instability, there has been a lack of robust spike detection and clustering procedures to overcome the spike loss problem. Here, we develop an automatic spike sorting algorithm based on adaptive spike detection and a mixture of skew-t distributions to address these distortions and instabilities. The adaptive detection procedure, applied to the detected spikes, consists of multi-point alignment and statistical filtering for removing mistakenly detected spikes. The detected spikes are clustered based on the mixture of skew-t distributions to deal with non-symmetrical clusters and spike loss problems. The proposed algorithm improves the performance of spike sorting in terms of both precision and recall, over a broad range of signal-to-noise ratios. Furthermore, the proposed algorithm has been validated on different datasets and demonstrates a general solution to precise spike sorting, in vitro and in vivo.

Time–frequency analysis of keystroke dynamics for user authentication

Published in Future Generation Computer Systems, 2021 [Link]

Authors

Ramin Toosi, Mohammad Ali Akhaee

Abstract

With the increasing need for information security, one of the key options ahead is to provide security based on biometrics. Authentication based on keystroke dynamics is a low-cost and convenient biometric authentication technique. In this paper, a method for user authentication based on keystroke dynamics with a novel similarity measure is introduced. Using time–frequency analysis, a similarity measure between an input sample and user reference samples is directly obtained. The input sample is initially converted to a keystroke dynamics signal. The dynamic time warping method is applied to equalize the lengths of the signals. Then, using the Wigner distribution, the time–frequency representation of the samples is obtained. Finally, exploiting the correlation coefficient, the similarity between two signals in the time–frequency domain is measured. We also added an update procedure to the proposed method to enhance its performance. The performance of the proposed method is investigated and compared with the state-of-the-art methods. Experimental results show the superiority of the proposed method.
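Two of the steps named in this abstract, length equalization by dynamic time warping and similarity by correlation coefficient, can be sketched as follows. This is a simplified illustration and not the authors' implementation: it skips the Wigner distribution and correlates the warped signals directly in the time domain, and the function names and sample values are hypothetical.

```python
from math import sqrt


def dtw_path(a, b):
    """Return the optimal DTW alignment as a list of (index_a, index_b) pairs."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def warp(a, b):
    """Stretch both signals along the DTW path so they have equal length."""
    path = dtw_path(a, b)
    return [a[i] for i, _ in path], [b[j] for _, j in path]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = sqrt(sum((p - mx) ** 2 for p in x))
    sy = sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy) if sx and sy else 0.0


# Hypothetical key hold times (seconds) for a reference and an input sample.
reference = [0.1, 0.3, 0.2, 0.5]
sample = [0.1, 0.3, 0.3, 0.2, 0.5]
warped_ref, warped_sample = warp(reference, sample)
score = pearson(warped_ref, warped_sample)
```

In the pipeline the abstract describes, the correlation would be computed between the time–frequency representations rather than between the raw warped signals.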

Hate Sentiment Recognition System For Persian Language

Published in International Conference on Computer and Knowledge Engineering (ICCKE), 2022 [Link]

Authors

Pegah Shams Jey, Arash Hemmati, Ramin Toosi, Mohammad Ali Akhaee

Abstract

People’s lives in societies are tied to social networks, and these networks face problems such as the existence of hateful speech. Most social networks try to identify and prevent the spread of this phenomenon by using natural language processing (NLP) methods. On the internet, hate speech causes arguments between different groups in society. Given that anyone is able to put any content on social media in the form of a short text, this leads to the uncontrollable spread of hatred on social networks and can cause harm to individuals and various groups in society. It is therefore necessary to have control over users’ content on social media. In this study, a method for identifying hateful content in short texts is proposed. First, cosine similarities of word-based and character-based n-grams (features) and sentences (samples) are calculated. Then, employing a calibrated support vector machine (SVM), the probability of each feature being related to hatred is calculated. Finally, another SVM is applied for the final classification. The proposed method is compared to state-of-the-art methods such as ParsBERT and a multi-view SVM approach using Instagram comments in terms of various performance metrics. Results show that the proposed method outperforms the previous studies.
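The first step in this abstract relies on cosine similarity over word-based and character-based n-grams. The sketch below shows one plausible reading, assuming each text is represented as a bag of n-gram counts and two such bags are compared; the padding, lowercasing, whitespace tokenization, and function names are illustrative assumptions, and the calibrated SVM stages are not shown.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count character n-grams, padding with spaces to mark text boundaries."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def word_ngrams(text, n=1):
    """Count word n-grams over whitespace-separated tokens."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine(c1, c2):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c1[g] * c2[g] for g in c1.keys() & c2.keys())
    norm = sqrt(sum(v * v for v in c1.values())) * sqrt(sum(v * v for v in c2.values()))
    return dot / norm if norm else 0.0


same = cosine(char_ngrams("hateful text"), char_ngrams("hateful text"))
disjoint = cosine(char_ngrams("abc"), char_ngrams("xyz"))
partial = cosine(word_ngrams("you are bad"), word_ngrams("you are kind"))
```

Character n-grams tolerate the spelling variation common in short social media comments, which is one reason they are often paired with word n-grams.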

Automated Person Identification from Hand Images using Hierarchical Vision Transformer Network

Published in International Conference on Computer and Knowledge Engineering (ICCKE), 2022 [Link]

Authors

Zahra Ebrahimian, Seyed Ali Mirsharji, Ramin Toosi, Mohammad Ali Akhaee

Abstract

Nowadays, person identification is widely used for security purposes. Identity verification is done using a variety of techniques. Biometric authentication is the best-known and most popular secure form of authentication on most devices. In this research, dorsal and palmar hand images, which are regarded as two important biometric characteristics, are both used for biometric authentication. In order to take into account both global and local variables for determining human identity, we propose a two-stream hierarchical vision transformer with two independent inputs of the whole hand image and knuckle sub-images drawn from the 11k-Hand dataset. As a result of this approach, we achieved an accuracy of 99.4% and an error rate of 2.47% to identify people.

Multinomial Emoji Prediction Using Deep Bidirectional Transformers and Topic Modeling

Published in International Conference on Electrical Engineering (ICEE), 2022 [Link]

Authors

Zahra Ebrahimian, Ramin Toosi, Mohammad Ali Akhaee

Job Title Prediction from Tweets Using Word Embedding and Deep Neural Networks

Published in International Conference on Electrical Engineering (ICEE), 2022 [Link] [Code]

Authors

Shayan Vassef, Ramin Toosi, Mohammad Ali Akhaee

Abstract

The more social media takes its place in our lives, the more critical its analysis becomes and the more researchers’ attention is drawn to it. Studies cover various topics such as sentiment analysis, trend prediction, bot detection, etc. Here, for the first time, we propose a novel method to predict the job title of social media users. Twitter, a popular social media platform, is our target. We introduce a dataset consisting of 1314 samples, including users’ tweets and bios. The user’s job title is found using Wikipedia crawling. The challenge of multiple job titles per user is handled using a semantic word embedding and clustering method. Then, a job prediction method is introduced based on a deep neural network and TF-IDF word embedding. We also use hashtags and emojis in the tweets for job prediction. Results show that the job title of users on Twitter could be well predicted with 54% accuracy in nine categories.
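The TF-IDF representation mentioned in this abstract can be illustrated with a small sketch. It assumes whitespace tokenization, which keeps hashtags and emojis as ordinary tokens, and the unsmoothed weighting tf × log(N / df); library implementations typically add smoothing, and the paper's exact variant is not stated here. The function name and the example tweets are hypothetical.

```python
from collections import Counter
from math import log


def tfidf(docs):
    """Return one sparse {token: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency).
    Whitespace tokenization keeps hashtags and emojis as ordinary tokens.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears.
    df = Counter(token for tokens in tokenized for token in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


tweets = ["goal match #team", "vote law #team"]
vectors = tfidf(tweets)
```

A token that occurs in every document, such as `#team` here, receives a weight of zero, so only tokens that discriminate between users contribute to the feature vector.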

Fast and Temporal Consistent Video Style Transfer

Published in International Conference on Pattern Recognition and Image Analysis (IPRIA), 2021 [Link]

Authors

Ali Abbasi, Ramin Toosi, Mohammad Ali Akhaee

Abstract

Style transfer is a subset of the image transformation problem in which the output is a combination of the style of a reference image and the content of an input image. Despite recent advances in image processing, video style transfer is still a challenging problem. One can apply such methods to a video by considering each frame independently. However, an unpleasant flickering effect will be observable in the output. Here, we propose a method to properly transfer a reference style to the input frames while preserving temporal consistency. Thus, the flickering effect will be considerably mitigated. The proposed method is compared to two video style transfer methods using the Sintel dataset. Results show that the proposed method keeps a better trade-off between temporal consistency and spatial losses.

Face manifold: manifold learning for synthetic face generation

Published in Multimedia Tools and Applications, Springer US, 2023

Authors

K Dinashi, Ramin Toosi, Mohammad Ali Akhaee

Abstract

The face is a crucial aspect of human communication and identity. Accurately estimating face structure is a fundamental task in computer vision, with significant applications in various fields, including facial recognition and medical surgeries. Deep learning techniques have made notable progress in 3D face reconstruction from 2D images. However, this approach demands large 3D face datasets, often tackled by synthetic face generation. Unfortunately, synthetic datasets can contain non-possible faces, which pose significant challenges. This paper presents a novel approach to synthetic diverse face dataset generation by leveraging face manifold learning. We divide the face structure into shape and expression groups and use a fully convolutional autoencoder network to handle non-possible faces while preserving dataset diversity. The proposed method is used to train deep 3D reconstruction networks and results …

Farsi CAPTCHA Recognition Using Attention-Based Convolutional Neural Network

Published in 2023 9th International Conference on Web Research (ICWR), IEEE, 2023

Authors

Matine Hajyan, Alireza Hosseni, Ramin Toosi, Mohammad Ali Akhaee

Abstract

Getting around CAPTCHAs is essential for stopping fraudulent online activity. The creation of efficient CAPTCHA-breaking algorithms in the context of Persian can help safeguard Farsi-speaking users from a variety of online dangers and enhance their overall online experience. This study offers a novel method for recognizing Persian CAPTCHAs, which was developed and tested on a large and distinctive dataset. Our approach to Farsi CAPTCHA recognition leverages deep learning models, specifically a combination of the TPS-Resnet-BiLSTM-ATTN model, which surpasses other approaches and breaks Farsi CAPTCHAs with the highest possible accuracy. We have achieved amazing results with promising implications for boosting the security and usability of many online services that depend on CAPTCHA authentication by delving deeply into the impact of attention modules on CAPTCHA recognition.

Persian Ezafeh Recognition using Transformer-Based Models

Published in 2023 9th International Conference on Web Research (ICWR), IEEE, 2023

Authors

Ali Ansari, Zahra Ebrahimian, Ramin Toosi, Mohammad Ali Akhaee

Abstract

In Persian, the grammatical particle ezafe connects two words. Ezafe is one of the salient factors in Persian phonology and morphology for understanding the meaning of a sentence completely and correctly, yet it is not usually written in sentences, resulting in mistakes in reading complex sentences and errors in natural language processing tasks. Therefore, recognizing words that need ezafe at their end is a major factor in improving the performance of a variety of NLP-based systems, such as a text-to-speech (TTS) system. This is because Persian TTS systems without an ezafe recognition module cannot produce ezafe constructions to read the text correctly and do not recognize the relations between the words. As Transformer-based methods show state-of-the-art results in many NLP tasks, in this paper, we experiment with ParsBERT on the task of ezafe recognition. The latter earning 2.68% better F1-score than the prior state …

Bilingual COVID-19 Fake News Detection Based on LDA Topic Modeling and BERT Transformer

Published in 2023 6th International Conference on Pattern Recognition and Image Analysis (IPRIA), IEEE, 2023

Authors

Pouria Omrani, Zahra Ebrahimian, Ramin Toosi, Mohammad Ali Akhaee

Abstract

The spread of fake news has become more prevalent given the popularity of social media and the variety of news circulating on it. As a result, it is crucial to discern between real and fake news. During the COVID-19 pandemic, there have been numerous tweets, posts, and news items about this illness on social media and electronic media worldwide. This research presents a bilingual model combining Latent Dirichlet Allocation (LDA) topic modeling and the BERT transformer to detect COVID-19 fake news in both Persian and English. First, the dataset is prepared in Persian and English, and then the proposed method is used to detect COVID-19 fake news on the prepared dataset. Finally, the proposed model is evaluated using various metrics such as accuracy, precision, recall, and the F1-score. As a result of this approach, we achieve 92.18% accuracy, which shows that adding topic information to the pre-trained …
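The evaluation metrics listed in this abstract follow standard definitions, which the sketch below computes for a binary fake/real labeling. The function name and the toy labels are hypothetical and unrelated to the paper's dataset or its reported results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels: 1 = fake news, 0 = real news.
metrics = binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Reporting precision and recall alongside accuracy matters when the fake and real classes are imbalanced, since accuracy alone can hide poor detection of the minority class.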