Product reviews can be collected from the most popular online sources using a variety of techniques.

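import scrapy


class ReviewsSpider(scrapy.Spider):
    name = 'reviews'
    allowed_domains = ['amazon.com']
    start_urls = ["https://www.amazon.com/GoPro-Fusion-Waterproof-Digital-Spherical/product-reviews/B0792MJLNM/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"]

    def parse(self, response):
        # Each candidate review container carries both the a-section and review classes.
        for item in response.css('.a-section.review'):
            # Keep only containers whose first div is marked data-hook="review".
            if item.css('div::attr(data-hook)').extract_first() == 'review':
                yield {
                    # The review id is the second hyphen-separated part of the element id.
                    'review_id': item.css('div.a-section.celwidget::attr(id)').extract_first().split('-')[1],
                    'author': item.css('span.a-profile-name::text').extract_first(),
                    # Review text may be split across several text nodes; join them.
                    'review': ' '.join(item.css('span.review-text::text').extract())
                }

        # Follow the "next page" link, if there is one, and parse it the same way.
        next_page = response.css('.a-last > a::attr(href)').extract_first()
        if next_page is not None:
            yield response.follow(next_page, callback=self.parse)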

In this very basic example, the spider visits the reviews page for a GoPro sold on Amazon, collects each reviewer’s name, the review ID and the review text, and follows the pagination link to the next page of reviews.
This data can then feed sentiment analysis and similar reporting, helping your company improve its products and its customer experience.
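
As a rough illustration of how the collected items might be turned into a sentiment signal, the sketch below removes duplicate review IDs and labels each review with a naive word count. It is not part of the original spider: the word lists, the function names `score_review` and `summarize`, and the sample reviews are all made up for the example, and a real project would more likely use a trained model or an established sentiment lexicon.

```python
import re

# Hypothetical word lists, chosen only for illustration.
POSITIVE = {"great", "love", "excellent", "amazing", "good"}
NEGATIVE = {"bad", "poor", "terrible", "broken", "disappointing"}


def score_review(text):
    """Return (number of positive words) - (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def summarize(items):
    """Skip repeated review_ids, then count positive/negative/neutral reviews."""
    seen = set()
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for item in items:
        if item["review_id"] in seen:
            continue  # the same review can show up on more than one page
        seen.add(item["review_id"])
        score = score_review(item["review"] or "")
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        counts[label] += 1
    return counts


# Made-up sample items in the same shape the spider yields.
sample = [
    {"review_id": "R1", "author": "A", "review": "Great camera, love the footage."},
    {"review_id": "R2", "author": "B", "review": "Terrible battery and poor app."},
    {"review_id": "R2", "author": "B", "review": "Terrible battery and poor app."},
    {"review_id": "R3", "author": "C", "review": "It records video."},
]
print(summarize(sample))
```

Counting words this way ignores negation and context ("not good" scores as positive), so it is best read as a placeholder for whatever sentiment method you actually adopt.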
