An Optimized Parallel Implementation of Non-Iteratively Trained Recurrent Neural Networks

dc.contributor.author: El-Zini, Julia
dc.contributor.author: Rizk, Yara
dc.contributor.author: Awad, Mariette
dc.contributor.department: Department of Electrical and Computer Engineering
dc.contributor.faculty: Maroun Semaan Faculty of Engineering and Architecture (MSFEA)
dc.contributor.institution: American University of Beirut
dc.date.accessioned: 2025-01-24T11:30:35Z
dc.date.available: 2025-01-24T11:30:35Z
dc.date.issued: 2021
dc.description.abstract: Recurrent neural networks (RNNs) have been successfully applied to various sequential decision-making tasks, natural language processing applications, and time-series predictions. Such networks are usually trained through back-propagation through time (BPTT), which is prohibitively expensive, especially when the length of the time dependencies and the number of hidden neurons increase. To reduce the training time, extreme learning machines (ELMs) have recently been applied to RNN training, reaching a 99% speedup on some applications. Due to its non-iterative nature, ELM training, when parallelized, has the potential to reach higher speedups than BPTT. In this work, we present Opt-PR-ELM, an optimized parallel RNN training algorithm based on ELM that takes advantage of the GPU shared memory and of parallel QR factorization algorithms to efficiently reach optimal solutions. The theoretical analysis of the proposed algorithm is presented on six RNN architectures, including LSTM and GRU, and its performance is empirically tested on ten time-series prediction applications. Opt-PR-ELM is shown to reach speedups of up to 461 times over its sequential counterpart and to require up to 20 times less time to train than parallel BPTT. Such high speedups over new-generation CPUs are crucial in real-time applications and IoT environments. © 2021 Julia El Zini et al., published by Sciendo.
dc.identifier.doi: https://doi.org/10.2478/jaiscr-2021-0003
dc.identifier.eid: 2-s2.0-85117524083
dc.identifier.uri: http://hdl.handle.net/10938/27455
dc.language.iso: en
dc.publisher: Sciendo
dc.relation.ispartof: Journal of Artificial Intelligence and Soft Computing Research
dc.source: Scopus
dc.subject: Extreme learning machines (ELM)
dc.subject: Gated recurrent unit (GRU)
dc.subject: GPU implementation
dc.subject: Long short-term memory (LSTM)
dc.subject: Non-iterative training
dc.subject: Parallelization
dc.subject: Recurrent neural network (RNN)
dc.title: An Optimized Parallel Implementation of Non-Iteratively Trained Recurrent Neural Networks
dc.type: Article
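
The abstract describes non-iterative (ELM-style) RNN training that relies on QR factorization. As a rough illustration of that general idea only, the sketch below fixes random input and recurrent weights for a simple Elman recurrence and solves for the output weights in one least-squares step via QR. It is a sequential NumPy sketch under stated assumptions, not the authors' Opt-PR-ELM algorithm or its GPU implementation; the function names (`hidden_states`, `elm_rnn_fit`, `elm_rnn_predict`), the tanh activation, the weight ranges, and the sine-wave data are all hypothetical choices made for this example.

```python
import numpy as np


def hidden_states(X, W_in, W_rec, b):
    """Run a simple Elman recurrence; return the last hidden state per sample.

    X has shape (n_samples, n_steps, n_features).
    """
    n_samples, n_steps, _ = X.shape
    h = np.zeros((n_samples, W_in.shape[1]))
    for t in range(n_steps):
        h = np.tanh(X[:, t, :] @ W_in + h @ W_rec + b)
    return h


def elm_rnn_fit(X, y, n_hidden=8, seed=0):
    """Fit only the output weights; assumes n_samples >= n_hidden."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[2]
    # Input, recurrent, and bias weights are drawn once and never updated
    # (assumed uniform in [-1, 1] for this sketch).
    W_in = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    W_rec = rng.uniform(-1.0, 1.0, size=(n_hidden, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = hidden_states(X, W_in, W_rec, b)
    # Least squares via reduced QR: H = Q R, then solve R beta = Q^T y.
    Q, R = np.linalg.qr(H)
    beta = np.linalg.solve(R, Q.T @ y)
    return W_in, W_rec, b, beta


def elm_rnn_predict(X, params):
    """Apply the fixed recurrence and the fitted linear readout."""
    W_in, W_rec, b, beta = params
    return hidden_states(X, W_in, W_rec, b) @ beta


if __name__ == "__main__":
    # Toy one-step-ahead prediction on a sine wave (illustrative data only).
    series = np.sin(np.linspace(0.0, 20.0, 300))
    window = 10
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    X = X[:, :, None]  # shape (n_samples, n_steps, 1)
    y = series[window:]
    params = elm_rnn_fit(X, y)
    mse = np.mean((elm_rnn_predict(X, params) - y) ** 2)
    print(f"training MSE: {mse:.6f}")
```

Because only the linear readout is solved for, training needs no gradient steps; the time loop and the QR step are the parts a parallel implementation would target.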

Files

Original bundle

Name: 2021-6201.pdf
Size: 1.25 MB
Format: Adobe Portable Document Format