Data algorithms /

Parsian, Mahmoud,

Data algorithms : recipes for scaling up with Hadoop and Spark / Mahmoud Parsian. - 1st ed. - Beijing : O'Reilly, c2015. - xxxvii, 737 p. : ill. ; 23 cm.

Includes bibliographical references (pages 721-723) and index.

Secondary sort : introduction -- Secondary sort : a detailed example -- Top 10 list -- Left outer join -- Order inversion -- Moving average -- Market basket analysis -- Common friends -- Recommendation engines using MapReduce -- Content-based recommendation : movies -- Smarter email marketing with the Markov model -- K-means clustering -- K-nearest neighbors -- Naive Bayes -- Sentiment analysis -- Finding, counting, and listing all triangles in large graphs -- K-mer counting -- DNA sequencing -- Cox regression -- Cochran-Armitage test for trend -- Allelic frequency -- The t-test -- Pearson correlation -- DNA base count -- RNA sequencing -- Gene aggregation -- Linear regression -- MapReduce and monoids -- The small files problem -- Huge cache for MapReduce -- The Bloom filter.


Apache Hadoop.
SPARK (Electronic resource)


Computer programming.





004 / PAR