Objectives:

1. Explore Amazon reviews

2. Score the sentiment of the reviews

3. Compare word frequency by helpfulness
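
As a rough illustration of the third objective, here is a minimal, dependency-free sketch in plain Python. It is not the workshop notebook code, which runs on Spark. The sample records are made up. The field names `reviewText` and `helpful` (a `[helpful_votes, total_votes]` pair) follow the Amazon review JSON. The 0.5 threshold and the function names are assumptions chosen for the example.

```python
from collections import Counter
import re

# Hypothetical sample records, shaped like the Amazon review JSON:
# "helpful" holds [helpful_votes, total_votes].
reviews = [
    {"reviewText": "Great sound, great price", "helpful": [8, 10]},
    {"reviewText": "Broke after a week", "helpful": [1, 9]},
    {"reviewText": "Great battery life", "helpful": [5, 5]},
    {"reviewText": "Not rated yet", "helpful": [0, 0]},
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequency_by_helpfulness(records, threshold=0.5):
    """Count words separately for helpful and unhelpful reviews.

    A review counts as helpful when helpful_votes / total_votes is at
    least `threshold` (an assumed cutoff). Reviews with no votes are
    skipped because their helpfulness is unknown.
    """
    counts = {"helpful": Counter(), "unhelpful": Counter()}
    for record in records:
        up_votes, total_votes = record["helpful"]
        if total_votes == 0:
            continue
        ratio = up_votes / total_votes
        bucket = "helpful" if ratio >= threshold else "unhelpful"
        counts[bucket].update(tokenize(record["reviewText"]))
    return counts


frequencies = word_frequency_by_helpfulness(reviews)
print(frequencies["helpful"].most_common(3))
print(frequencies["unhelpful"].most_common(3))
```

On the full dataset the same grouping would be done with Spark DataFrame operations rather than Python loops.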

Workshop Resources

Azure Notebooks Library

Sentiment Notebook

Commoners Notebook

More information

Datasets

http://jmcauley.ucsd.edu/data/amazon/ | Amazon reviews for NLP

http://mpqa.cs.pitt.edu/lexicons/effect_lexicon/ | +/- Effect Lexicon

Packages

http://nlp.johnsnowlabs.com/ | Spark Package for NLP

https://spark.apache.org/docs/latest/ml-guide.html | Spark ML guide (focus on the DataFrame-based API, not the RDD-based one)

 

 


