Product reviews can be collected from the most popular online sources using a variety of techniques.

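import scrapy


class ReviewsSpider(scrapy.Spider):  # class name is illustrative
    name = 'reviews'
    # Placeholders: set these to the target domain and the reviews page URL.
    allowed_domains = ['']
    start_urls = [""]

    def parse(self, response):
        # Placeholder: set this to the CSS selector for the review containers.
        for item in response.css(''):
            # Only elements marked data-hook="review" are actual reviews.
            if item.css('div::attr(data-hook)').extract_first() == 'review':
                yield {
                    # The element id has the form "<prefix>-<id>"; keep the part after the hyphen.
                    'review_id': item.css('div.a-section.celwidget::attr(id)').extract_first().split('-')[1],
                    'author': item.css('span.a-profile-name::text').extract_first(),
                    # Placeholder: set this to the CSS selector for the review text.
                    'review': ' '.join(item.css('').extract()),
                }

        # Follow the "next page" link, if there is one, and parse it the same way.
        next_page = response.css('.a-last > a::attr(href)').extract_first()
        if next_page is not None:
            yield response.follow(next_page, callback=self.parse)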

Once collected, this data can be analyzed for sentiment, giving your company insight it can use to improve its products and customer experience.
In this very basic example, the spider visits the reviews page for a GoPro sold on Amazon and collects each review’s id, the reviewer’s name and the review text.
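
The spider takes the id straight from `.split('-')[1]`, which raises an error if the attribute is missing or has no hyphen. Below is a minimal sketch of how those two fields might be cleaned more defensively. The helper names `parse_review_id` and `join_review_text` are hypothetical, and the sample id format `customer_review-R1ABC` is an assumption made for illustration, not something taken from the page itself.

```python
from typing import List, Optional


def parse_review_id(element_id: Optional[str]) -> Optional[str]:
    """Return the part after the first hyphen, or None if it is absent."""
    if not element_id:
        return None
    parts = element_id.split('-')
    # Same idea as .split('-')[1] in the spider, but without raising on short input.
    return parts[1] if len(parts) > 1 and parts[1] else None


def join_review_text(fragments: List[str]) -> str:
    """Join extracted text fragments and collapse runs of whitespace."""
    return ' '.join(' '.join(fragments).split())


if __name__ == '__main__':
    # Assumed id format, for illustration only.
    print(parse_review_id('customer_review-R1ABC'))
    print(join_review_text(['  Great ', 'camera.\n']))
```

In the spider, these could wrap the values returned by `extract_first()` and `extract()`, so that a review with an unexpected layout yields `None` for its id instead of stopping the crawl.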
