Real-time Text Analytics Pipeline Using Open-source Big Data Tools
Authors:
Hassan Nazeer,
Waheed Iqbal,
Fawaz Bokhari,
Faisal Bukhari,
Shuja Ur Rehman Baig
Abstract:
Real-time text processing systems are required in many domains to quickly identify patterns, trends, sentiments, and insights. Social networks, e-commerce stores, blogs, scientific experiments, and server logs are now the main sources of large volumes of text data. Processing such data in real time, however, requires building a data processing pipeline. The main challenge in building such a pipeline is minimizing latency while processing high-throughput data. In this paper, we explain and evaluate our proposed real-time text processing pipeline, built with open-source big data tools, which minimizes the latency of processing data streams. The proposed pipeline is based on Apache Kafka for data ingestion, Apache Spark for in-memory data processing, Apache Cassandra for storing processed results, and the D3 JavaScript library for visualization. We evaluate the effectiveness of the proposed pipeline under varying deployment scenarios by performing sentiment analysis on a Twitter dataset. Our experimental evaluations show a latency of less than a minute when processing 466,700 Tweets in 10.7 minutes with three virtual machines allocated to the proposed pipeline.
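As a rough reading of the reported figures, the average processing rate can be worked out directly. The sketch below assumes that the 10.7 minutes is the total wall-clock time for the whole dataset; the function name is illustrative and does not come from the paper.

```python
# Figures quoted in the abstract; the helper itself is an illustrative assumption.
TWEETS = 466_700
MINUTES = 10.7


def average_throughput(items: int, minutes: float) -> float:
    """Return the average number of items processed per second."""
    return items / (minutes * 60.0)


rate = average_throughput(TWEETS, MINUTES)
print(f"about {rate:.0f} Tweets per second on average")
```

This is only an average over the run; the abstract does not say how the rate varied over time.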
Submitted 12 December, 2017;
originally announced December 2017.
On the Use of Smart Ants for Efficient Routing in Wireless Mesh Networks
Authors:
Fawaz Bokhari,
Gergely Zaruba
Abstract:
Routing in wireless mesh networks (WMNs) has been an active area of research for the last several years. In this paper, we address the problem of packet routing for efficient data forwarding in WMNs with the help of smart ants acting as intelligent agents. The aim of this paper is to study the use of such biologically inspired agents to route packets effectively in WMNs. In particular, we propose AntMesh, a distributed interference-aware data forwarding algorithm that uses smart ants to perform routing and data forwarding probabilistically and concurrently, in order to stochastically solve a dynamic network routing problem. AntMesh belongs to the class of routing algorithms inspired by the behaviour of real ants, which are known to find a shortest path between their nest and a food source. In addition, AntMesh can effectively exploit the space/channel diversity common in multi-radio WMNs and discover high-throughput paths with less inter-flow and intra-flow interference, where conventional wireless network routing protocols fail to do so. We implement our smart ant-based routing algorithm in ns-2 and carry out an extensive evaluation. We demonstrate the stability of AntMesh in terms of how quickly it adapts to changing dynamics or load on the network. We tune the parameters of the AntMesh algorithm to study their effect on its performance in terms of routing load and end-to-end delay, and we test its performance under various network scenarios, in particular fixed-node mesh networks and mobile WMN scenarios. The results show the advantages of AntMesh that make it a valuable candidate for operating in mesh networks.
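The abstract does not give AntMesh's update rules, so the following is not the authors' algorithm. It is a minimal sketch of the general ant-colony idea the abstract refers to, where a node picks its next hop with probability proportional to a pheromone value and then reinforces the chosen hop. The function names, the reward value, and the evaporation rate are all assumptions made for illustration.

```python
import random


def choose_next_hop(pheromone: dict[str, float], rng: random.Random) -> str:
    """Pick a neighbour with probability proportional to its pheromone value."""
    neighbours = list(pheromone)
    weights = [pheromone[n] for n in neighbours]
    return rng.choices(neighbours, weights=weights, k=1)[0]


def reinforce(pheromone: dict[str, float], chosen: str,
              reward: float, evaporation: float = 0.1) -> None:
    """Evaporate every entry, then deposit a reward on the chosen neighbour."""
    for n in pheromone:
        pheromone[n] *= 1.0 - evaporation
    pheromone[chosen] += reward


# Hypothetical pheromone table for one node with two neighbours.
table = {"a": 1.0, "b": 1.0}
hop = choose_next_hop(table, random.Random(0))
reinforce(table, "a", reward=0.5)
```

In this generic scheme, neighbours on paths that are rewarded more often are chosen more often, while evaporation lets the table adapt when conditions change. An interference-aware variant would need a reward that reflects channel conditions, which the abstract does not specify.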
Submitted 4 September, 2012;
originally announced September 2012.