All Questions
340 questions
0 votes, 1 answer, 128 views
/profile at the end of my URL causing web-scraping function to not work
I am trying to scrape data from Yahoo finance as part of a function. The URL with "/profile" at the end does not work, but if I take it off, the URL will pull in. Does anyone have any idea ...
0 votes, 0 answers, 56 views
Two scripts (Python and VBA) with the same logic produce different results when regex is implemented
I've created two scripts: one in Python and the other in VBA. Both scripts do the same thing but produce different results.
I've used a few links within the scripts to scrape a Facebook link from the ...
1 vote, 1 answer, 55 views
How to handle regex in BeautifulSoup / CSS selector?
I'm looking for a way to use regex in BeautifulSoup to find elements that contain the text HO #, allowing for optional spaces and ignoring case.
check_ho_number3 = soup.select_one('td:-...
-3 votes, 1 answer, 76 views
Websites Web Scraping Emails using Python [closed]
In my Python code I have a regex to find emails:
soup = BeautifulSoup(driver.page_source, "html.parser")
text_email = soup.get_text()
emails1 = re.findall(r'([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[...
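A minimal sketch of this kind of email extraction. The regex in the excerpt is truncated, so the pattern below is a simplified stand-in rather than the asker's own, and `page_text` is a hypothetical sample standing in for the result of `soup.get_text()`.

```python
import re

# Hypothetical sample text; in the question this would come from soup.get_text().
page_text = "Contact: info@example.com or sales@example.org. Phone: 555-0100"

# Simplified pattern (an assumption, not the truncated original):
# local part, "@", domain labels, then a top-level domain of 2+ letters.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# The pattern has no capture groups, so findall returns the full matches.
emails = EMAIL_RE.findall(page_text)
print(emails)
```

This pattern is deliberately loose; it will not validate every address allowed by the email standards.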
0 votes, 1 answer, 84 views
How to get each table row result in new list? Python webscraping
I'm trying to scrape company information from a public website. Two questions:
The link of each subpage is very long and complex. For example: https://link/details?id=535145A8-E3FA-DF11-BB5A-...
1 vote, 1 answer, 163 views
How to scrape Star Rating from Etsy HTML with Python?
I cannot find a way to extract the star rating of each product from the Etsy source code.
This is the code I've used to extract the description, price, and number of reviews from a set of text files ...
1 vote, 1 answer, 74 views
Python regular expression is not able to extract the text and URLs from the mail body
For example, I have a mail in my Outlook folder that has a subject and lots of Japanese text and URLs like below.
01 事務用品・機器
大阪府警察大正警察署:指サック等の購入 :大阪市大正区
https://www.e-nyusatsu.pref.osaka.jp/CALS/...
1 vote, 3 answers, 69 views
Need Assistance with a regex pattern in Python – Parsing complex HTML structures
I'm trying to parse complex HTML structures using Python's re module, and I've run into a roadblock with my regex pattern. Here's what I'm trying to do:
I have HTML text that contains nested elements,...
0 votes, 1 answer, 51 views
regex code to find email address within HTML script webscraping
I am trying to extract the phone number, address, and email from a couple of corporate websites through web scraping.
My code for that is as follows:
l = 'https://www.zimmermanfinancialgroup.com/about'
address_t = []
...
0 votes, 1 answer, 153 views
How can I extract value of variable from script element in Scrapy
I need to extract some data from a website. I found that everything I need exists in a <script> element, so I extracted it with this command:
script = response.css('[id="server-side-container"...
1 vote, 1 answer, 93 views
Regex python Match after and before a specific string
Let's say we have this
string:"Code:1,Some text some other text {fdf: more text, attr=important "
I want to catch the pattern using Regex that can findall attr and extract important and 1 and ...
0 votes, 1 answer, 829 views
How to exclude words in regex using negative lookahead?
I am trying to exclude a word from a sentence, but if the excluded word does not appear, the regex should keep searching for characters until the excluded word is found.
For example, let's suppose I ...
0 votes, 3 answers, 117 views
Extracting Key Value pairs from a String using Regex
I have a web-scraped string containing key-value pairs, e.g. firstName:"Quaran", lastName:"McPherson"
st = '{"accountId":405266,"firstName":"Quaran","...
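A minimal sketch of two ways to pull key-value pairs out of a string like this. The original string is truncated, so `sample` below is a hypothetical completed version built from the values named in the question. If the scraped text is valid JSON, `json.loads` is usually more robust than a regex; the regex is shown only as a fallback for fragments that do not parse.

```python
import json
import re

# Hypothetical sample; the real string in the question is truncated.
sample = '{"accountId":405266,"firstName":"Quaran","lastName":"McPherson"}'

# Option 1: parse as JSON when the whole string is well-formed.
record = json.loads(sample)
print(record["firstName"], record["lastName"])

# Option 2: regex fallback. Group 1 is the key, group 3 is a quoted
# string value, and group 2 alone is a bare numeric value.
pair_pattern = re.compile(r'"(\w+)"\s*:\s*("([^"]*)"|-?\d+(?:\.\d+)?)')
pairs = {}
for m in pair_pattern.finditer(sample):
    key = m.group(1)
    # Values are kept as strings here; quoted values lose their quotes.
    pairs[key] = m.group(3) if m.group(3) is not None else m.group(2)
print(pairs)
```

Note that the regex path returns every value as a string, while `json.loads` preserves numeric types.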
1 vote, 1 answer, 76 views
I need to scrape the instagram link that is highlighted in the image
I am trying regex in Python. My problem is how to clip out the portion that is:
"www.instagram.com%2FMohakMeet"
I need to know which characters to use in the regex.
#...
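One possible approach, sketched under assumptions: the fragment `www.instagram.com%2FMohakMeet` looks percent-encoded (`%2F` is `/`), so decoding first with `urllib.parse.unquote` makes the regex simpler. The surrounding URL in `raw` is hypothetical; only the encoded Instagram fragment comes from the question.

```python
import re
from urllib.parse import unquote

# Hypothetical redirect-style URL wrapped around the fragment from the question.
raw = "https://example.com/redirect?u=https%3A%2F%2Fwww.instagram.com%2FMohakMeet&e=abc"

# Decode percent-escapes so "%2F" becomes "/".
decoded = unquote(raw)

# Match the host plus a handle made of letters, digits, "_" and ".".
match = re.search(r"(?:www\.)?instagram\.com/[A-Za-z0-9_.]+", decoded)
link = match.group(0) if match else None
print(link)
```

The character class for the handle is an assumption about what Instagram usernames may contain; adjust it if the real data differs.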
0 votes, 1 answer, 36 views
REGEX: how do I get the name before the character ":"
I'm using Python to extract some info.
I want to get the words/names before the character :
but the problem is everything is tied together
from here
Morgan Stanley.Erik Woodring:
I just want to extract "...
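A minimal sketch, assuming the goal is the name between the last period and the colon (the excerpt is cut off before it says exactly which part is wanted). The pattern captures a run of characters that contains neither `.` nor `:` and is immediately followed by `:`.

```python
import re

# Sample line taken from the question.
line = "Morgan Stanley.Erik Woodring:"

# Capture text with no "." or ":" in it that ends right before a ":".
# Assumption: the speaker name follows the last "." on the line.
names = re.findall(r"([^.:]+):", line)
print(names)
```

If the company name before the period is wanted instead, the same idea works with the pattern anchored on `.` rather than `:`.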