Web and Network Data Science: Modeling Techniques in Predictive Analytics
By Thomas W. Miller

 
Programs and Data to Accompany "Web and Network Data Science: Modeling Techniques in Predictive Analytics" Miller (2015)

Note that many of the R programs contain library() commands that load functions from R packages. To run these programs, first install the required packages in your R environment. Likewise, many of the Python programs use data structures and methods from Python packages, which must be installed before they can be imported.

R programs were tested under R 3.1.1 on Mac OS 10.6.8. Python programs were tested under Enthought Canopy and Python 2.7 on Mac OS 10.6.8.
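
Before running one of the Python programs, it can help to check which of its packages are not yet installed. The following is a minimal sketch, not part of the book's code: the function name missing_packages and the package names in the example list are hypothetical placeholders, and it assumes Python 3.4 or later rather than the Python 2.7 environment noted above. Replace the list with the packages imported at the top of the program you want to run.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names in `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical example list: two standard-library modules and one
# placeholder name that is assumed not to be installed.
required = ["json", "csv", "a_package_that_is_not_installed"]

missing = missing_packages(required)
if missing:
    print("Install before running:", ", ".join(missing))
else:
    print("All listed packages are available.")
```

The check only looks up each package; it does not import it, so it runs quickly and has no side effects from the packages themselves.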

 


| Book Location | Description of Directory or File | File Name |
| --- | --- | --- |
| WNDS Chapter 1 | Browser Usage Data | browser_usage_2008_2014.csv |
| | Analysis of Browser Usage (Python) | wnds_chapter_1.py |
| | Analysis of Browser Usage (R) | wnds_chapter_1.R |
| WNDS Chapter 2 | ToutBay Website Traffic Data | toutbay_begins.csv |
| | Website Traffic Analysis (R) | wnds_chapter_2.R |
| WNDS Chapter 3 | Extracting and Parsing Web Site Data (Python) | wnds_chapter_3a.py |
| | Extracting and Parsing Web Site Data (R) | wnds_chapter_3a.R |
| | Directory for Simple One-Page Web Scraper (Python) | wnds_chapter_3b |
| | Directory for Crawling and Scraping while Napping (Python) | wnds_chapter_3c |
| WNDS Chapter 4 | Identifying Keywords for Testing Performance in Search (R) | wnds_chapter_4.R |
| | Directory of Keywords Data for the Angels | tickets_angels |
| | Directory of Keywords Data for the Dodgers | tickets_dodgers |
| WNDS Chapter 5 | Competitive Intelligence: Spirit Airlines Financial Dossier (R) | wnds_chapter_5.R |
| WNDS Chapter 6 | Enron E-Mail Network Data | enron_email_links.txt |
| | Defining and Visualizing Simple Networks (Python) | wnds_chapter_6a.py |
| | Defining and Visualizing Simple Networks (R) | wnds_chapter_6a.R |
| | Visualizing Networks-Understanding Organizations (R) | wnds_chapter_6b.R |
| WNDS Chapter 7 | Correlation Heat Map Utility (R) | correlation_heat_map_utility.R |
| | Wikipedia Votes Data | wiki_edges.txt |
| | Networks Models and Measures (R) | wnds_chapter_7a.R |
| | Methods of Sampling from Large Networks (R) | wnds_chapter_7b.R |
| WNDS Chapter 8 | Sentiment Analysis Negative Word List (text data) | Hu_Liu_negative_word_list.txt |
| | Sentiment Analysis Positive Word List (text data) | Hu_Liu_positive_word_list.txt |
| | Directories and Subdirectories of Movie Reviews (text data) | |
| | Training Data - Unsupervised/Unrated Reviews | reviews/train/unsup |
| | Training Data - Positive Reviews | reviews/train/pos |
| | Training Data - Negative Reviews | reviews/train/neg |
| | Test Data - Positive Reviews | reviews/test/pos |
| | Test Data - Negative Reviews | reviews/test/neg |
| | Test Data - Tom's Reviews | reviews/test/tom |
| | Split-plotting Utilities (R) | R_utility_program_3.R |
| | Text Scoring Script for Sentiment Analysis (R) | R_utility_program_5.R |
| | Initializer Module (Python) | __init__.py |
| | Utility Functions (Python): Evaluating the Predictive Accuracy of a Binary Classifier; Text Measures for Sentiment Analysis; Summative Scoring of Sentiment | python_utilities.py |
| | Sentiment Analysis and Classification of Movie Ratings (Python) | wnds_chapter_8_program.py |
| | Sentiment Analysis and Classification of Movie Ratings (R) | wnds_chapter_8_program.R |
| WNDS Chapter 9 | Directory of POTUS Speeches Data Organized by President Name (Oral Addresses Kennedy through Obama) | ALL_POTUS |
| | Directory of POTUS Speeches Data (Oral Addresses Kennedy through Obama) | POTUS |
| | Discovering Common Themes: POTUS Speeches (Python) | wnds_chapter_9a.py |
| | Multidimensional Scaling Results | POTUS_mds.csv |
| | Making Word Clouds: POTUS Speeches (R) | wnds_chapter_9b.R |
| | From Text Measures to Text Maps: POTUS Speeches (R) | wnds_chapter_9c.R |
| WNDS Chapter 10 | Anonymous Microsoft Web Attribute Data | microsoft_attribute_data.csv |
| | Anonymous Microsoft Web Test Data | microsoft_test_data.csv |
| | Anonymous Microsoft Web Training Data | microsoft_training_data.csv |
| | From Rules to Recommendations: The Microsoft Case (R) | wnds_chapter_10.R |
| | Anonymous Microsoft Web Data Organized as Transactions (partial output from wnds_chapter_10.R) | microsoft_training_transactions.csv |
| WNDS Chapter 11 | Directory of NetLogo Simulation Results | NetLogo_results |
| | NetLogo Results Data | virus_results.csv |
| | Analysis of Agent-Based Simulation Results (Python) | wnds_chapter_11.py |
| | Analysis of Agent-Based Simulation Results (R) | wnds_chapter_11.R |
| WNDS Appendix C | E-Mail or Spam Case Study Data | email_or_spam.csv |
| | ToutBay Website Traffic Data | toutbay_begins.csv |
| | Enron E-Mail Network Data | enron_email_links.txt |
| | Directory of POTUS State of the Union Addresses (Oral and written, all Presidents) | POTUS_COMPLETE |
| | Directory of POTUS Speeches Data Organized by President Name (Oral Addresses Kennedy through Obama) | ALL_POTUS |
| | Directory of POTUS Speeches Data (Oral Addresses Kennedy through Obama) | POTUS |
| | Directory of Keywords Data for the Angels | tickets_angels |
| | Directory of Keywords Data for the Dodgers | tickets_dodgers |
| | Wikipedia Votes Case Study Data | wiki_edges.txt |
| | Anonymous Microsoft Web Attribute Data | microsoft_attribute_data.csv |
| | Anonymous Microsoft Web Test Data | microsoft_test_data.csv |
| | Anonymous Microsoft Web Training Data | microsoft_training_data.csv |
| WNDS Appendix D | Utility Functions (Python): Evaluating the Predictive Accuracy of a Binary Classifier; Text Measures for Sentiment Analysis; Summative Scoring of Sentiment | python_utilities.py |
| | Split-plotting Utilities (R) | R_utility_program_3.R |
| | Text Scoring Script for Sentiment Analysis (R) | R_utility_program_5.R |
| | Correlation Heat Map Utility (R) | correlation_heat_map_utility.R |