All Questions

-1 votes
0 answers
74 views

How to scrape the full New York Times article content using Selenium and BeautifulSoup without triggering the "Please enable JavaScript" message?

I'm building a scraper that fetches full article content from the New York Times using both the Article Search API and a hybrid static + Selenium-based HTML scraper. My goal is to extract complete ...
Abhishek Joshi's user avatar
0 votes
1 answer
100 views

Why can't I extract listing information?

I am trying to extract the EPC rating from each listing. You can only get the EPC rating when you click on the listing. Each time I run my script it keeps timing out. What could be the issue? ...
Chioma Okoroafor's user avatar
0 votes
1 answer
143 views

Can't use captcha solver api services due to missing the element "data-sitekey"

I'm trying to use a captcha solving service, but all of them request a code that's attached to an element called "data-sitekey" that should be in the HTML of a page with recaptcha on it. ...
Kason's user avatar
  • 1
0 votes
1 answer
73 views

Get data hidden in ellipses while web scraping

I'm attempting to grab the episode title shown at the header of this website. When inspecting the page elements myself I can see near the top a line of HTML like this: <h1 id="epName">...&...
Dmitri's user avatar
  • 1
0 votes
2 answers
78 views

How to parse HTML hidden behind JS scripts

The FCC has a database with details about various broadcast licenses. Many of these licenses have pages like this one. Most of the data on these pages (and related ones) can be scraped very easily with ...
Jonsey's user avatar
  • 1
0 votes
1 answer
151 views

scrape a dropdown list using playwright

I'm struggling to find a way to click on the "All" option in a dropdown list and scrape all the content inside that page. I have come across a few posts but they're a little different from ...
Anh Tu Pham's user avatar
0 votes
0 answers
84 views

scrape link for jwplayer calculated with JS using python

I'm trying to scrape the video link (m3u8) from this website: https://deaddrive.xyz/embed/fa31e. While inspecting the page, I realized that the link is calculated on the fly using JS in the function: &...
Gaurav Suman's user avatar
0 votes
2 answers
182 views

How to click the next link with Zyte browser automation?

The Zyte tutorial "Create your first spider" crawls this page which has a pager with a "normal" next link. But what if the next link contains only a href="#" and executes ...
Ralf Zosel's user avatar
0 votes
1 answer
38 views

How to exclude div classes 'modal-content' and 'modal-body' from pyppeteer web scraper?

I'm building a scraper that gets text data from a list of articles. A common pattern in the text content I'm scraping at the moment is that at the bottom there is this message: "As a subscriber, ...
Shehzadi Aziz's user avatar
0 votes
1 answer
52 views

Extracting the text between span tags in a Javascript-rendered page using Selenium in Python

I am trying to scrape all instances of text between tags with a particular class on a web page that dynamically updates. I am using selenium with a chrome WebDriver in Python. In a normal browser, ...
zicari's user avatar
  • 5
2 votes
3 answers
95 views

scraping table from web page

I'm trying to scrape a table from a webpage using Selenium and BeautifulSoup but I'm not sure how to get to the actual data using BeautifulSoup. webpage: https://leetify.com/app/match-details/5c438e85-...
Horde Bob's user avatar
0 votes
1 answer
242 views

Using Python with Selenium and BeautifulSoup4 how can I get data after JavaScript has loaded all elements on the page?

I'm trying to scrape data from a sandbox website just to practice and start using python to scrape web data. I have managed to extract a lot of data using the basics however I have found an element ...
mattie malling's user avatar
0 votes
0 answers
65 views

Weird API response with <script> JavaScript tag

I am working on a project in Python that scrapes a university portal website to retrieve a weekly schedule. I see from the developer tools that the schedule page makes an API call and receives a JSON ...
AmaFor's user avatar
  • 13
0 votes
1 answer
159 views

make client send http request for backend flask

I am trying to avoid rate limiting and IP blacklisting while accessing an external API. I want to deploy a Flask web app on Google App Engine. I need a way to have the client send HTTP requests to the ...
Bear's user avatar
  • 11
0 votes
0 answers
23 views

Is there a way to mimic the Element.closest() function from javascript in Scrapy python?

I am trying to convert the web scraper I built in JavaScript using the puppeteer library into a Python-based web scraper running on Scrapy. I want to be able to do something similar to JavaScript's ...
Christopher Cho's user avatar
