R code issue


KARTHIK KRISHNAN PhD MGT Delhi 2022

Jan 27, 2024, 8:04:52 PM
to dataanalys...@googlegroups.com
Dear group members,

I am trying to scrape the Amazon reviews of a product with the R code below. I face two challenges:

1) I am unable to loop the code across all pages of reviews, so my output is currently limited to the first page.
2) The numbers of usernames, ratings and reviews extracted are not equal, so I get errors when I combine them in a data frame.

Could someone please help me?


library(xml2)
library(rvest)
library(purrr)
library(stringr)

HTML_Dump <- read_html("https://www.amazon.in/Renewed-OnePlus-Mirror-128GB-Storage/product-reviews/B0BS78H3BK/ref=cm_cr_getr_d_paging_btm_prev_1?ie=UTF8&reviewerType=all_reviews&pageNumber=1")

#username
HTML_Dump %>%
  html_nodes(".a-profile-name")%>%
  html_text() -> user_name
length(user_name)

#ratings
HTML_Dump %>%
  html_nodes(".review-rating")%>%
  html_text() -> mobile_rating

length(mobile_rating)

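#reviews
# split each review block into lines and keep the 11th one;
# map_chr(11) errors if any block has fewer than 11 lines
HTML_Dump %>%
  html_nodes(".a-row.a-spacing-small.review-data")%>%
  html_text() %>%
  str_split("\\n") %>%
  map_chr(11) %>%
  str_trim() -> mobile_review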

length(mobile_review)

# only the review text is stored for now, because the three vectors differ in length
Reviews <- data.frame(mobile_review)

dim(Reviews)
head(Reviews)
write.csv(Reviews,"D:\\Research\\Reviews\\Reviews.csv")
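
One possible way to approach both points is sketched below. It is a sketch under assumptions, not a tested solution for Amazon. It assumes rvest 1.0.0 or later, and it assumes that each review sits in its own container matching "[data-hook='review']"; that selector is a guess and should be checked against the live page. parse_reviews and scrape_reviews are hypothetical helper names, and the read_fn and pause arguments exist only so the loop can be tried offline. The idea is to select the review containers first and then extract one username, rating and review from each container, so that a missing field becomes NA and all columns have the same length. The review text is taken whole here instead of as the 11th line of the block. Amazon may also return a sign-in or CAPTCHA page instead of reviews, in which case the loop simply stops early.

```r
library(rvest)

# Parse one page into a data frame with one row per review container.
# ASSUMPTION: block_css matches exactly one element per review.
parse_reviews <- function(page, block_css = "[data-hook='review']") {
  blocks <- html_elements(page, block_css)
  # html_element() (singular) returns one result per container, and its
  # text is NA where the field is absent, so the columns stay aligned.
  data.frame(
    user_name     = html_text2(html_element(blocks, ".a-profile-name")),
    mobile_rating = html_text2(html_element(blocks, ".review-rating")),
    mobile_review = html_text2(
      html_element(blocks, ".a-row.a-spacing-small.review-data")
    ),
    stringsAsFactors = FALSE
  )
}

# Fetch pages 1..max_pages and stop at the first page without reviews.
# ASSUMPTION: base_url is the review URL without its pageNumber parameter.
scrape_reviews <- function(base_url, max_pages = 5, pause = 2,
                           read_fn = read_html) {
  pages <- list()
  for (i in seq_len(max_pages)) {
    page_df <- parse_reviews(read_fn(paste0(base_url, "&pageNumber=", i)))
    if (nrow(page_df) == 0) break   # no review containers found: stop
    pages[[i]] <- page_df
    Sys.sleep(pause)                # wait between requests
  }
  do.call(rbind, pages)             # NULL if no page had any reviews
}

# Example call (not run here because it needs network access):
# base_url <- "https://www.amazon.in/Renewed-OnePlus-Mirror-128GB-Storage/product-reviews/B0BS78H3BK/ref=cm_cr_getr_d_paging_btm_prev_1?ie=UTF8&reviewerType=all_reviews"
# Reviews <- scrape_reviews(base_url, max_pages = 10)
# write.csv(Reviews, "D:\\Research\\Reviews\\Reviews.csv", row.names = FALSE)
```

Because every row comes from a single container, a review without a rating shows NA in mobile_rating instead of shifting the later values up by one.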