Semantic Caching Demo on SparkSQL and HDFS

Overview: A cache management system for SparkSQL (Apache Spark) and HDFS

Requirements: SparkSQL, Apache Spark, HDFS, data caching algorithms, and English skills.

Motivation: We currently use HDFS to store and manage our data, and we run data analytics on it with SparkSQL in Apache Spark.

Normally, a Spark application loads the distributed data from HDFS into memory, processes it, and finally writes the results back to HDFS. Sometimes the results of previous queries could be reused, once or several times, by later queries. We therefore want to keep these results in memory long enough to be reused, through a caching mechanism called semantic caching.
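To make the idea concrete, here is a minimal sketch of semantic caching in plain Java, with no Spark dependency. It is only an illustration under simplifying assumptions, not the requested implementation: each query is assumed to be a range filter lo <= value < hi over one integer column, and the class name SemanticCacheSketch and its query method are hypothetical. A new query whose range lies inside a cached range is answered by filtering the cached rows instead of rescanning the source. In a real Spark program the cached rows would more likely be a persisted DataFrame than a Java list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: a semantic cache for range queries on one integer column.
class SemanticCacheSketch {

    /** One cached query: the half-open range [lo, hi) it covered and its result rows. */
    private static final class Entry {
        final int lo;
        final int hi;
        final List<Integer> rows;

        Entry(int lo, int hi, List<Integer> rows) {
            this.lo = lo;
            this.hi = hi;
            this.rows = rows;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    int hits = 0;   // queries answered from a cached result
    int misses = 0; // queries that had to scan the source

    /** Returns all values v in source with lo <= v < hi, reusing a cached result when possible. */
    List<Integer> query(int lo, int hi, List<Integer> source) {
        for (Entry e : entries) {
            // Containment check: the cached range covers the requested range,
            // so the answer is a subset of the cached rows.
            if (e.lo <= lo && hi <= e.hi) {
                hits++;
                return filter(e.rows, lo, hi);
            }
        }
        // No cached result covers this query: scan the source (standing in
        // for a read from HDFS) and remember the result for later queries.
        misses++;
        List<Integer> result = filter(source, lo, hi);
        entries.add(new Entry(lo, hi, result));
        return result;
    }

    private static List<Integer> filter(List<Integer> rows, int lo, int hi) {
        return rows.stream()
                .filter(v -> lo <= v && v < hi)
                .collect(Collectors.toUnmodifiableList());
    }

    public static void main(String[] args) {
        SemanticCacheSketch cache = new SemanticCacheSketch();
        List<Integer> data = List.of(1, 5, 10, 15, 20, 25);

        System.out.println(cache.query(0, 20, data));  // miss: scans the source
        System.out.println(cache.query(5, 15, data));  // hit: filtered from the cached [0, 20) result
        System.out.println(cache.query(10, 30, data)); // miss: [10, 30) is not covered by [0, 20)
        System.out.println("hits=" + cache.hits + ", misses=" + cache.misses);
    }
}
```

This sketch deliberately leaves out eviction, partial overlap (answering part of a query from the cache and fetching only the remainder), and invalidation when the underlying HDFS data changes, all of which a real implementation would need to address.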

What we want: an implementation of a semantic caching program in Apache Spark. The program should be written in Scala or Java; Scala is preferred.