How Do I Ignore Tags While Getting The .string Of A Beautiful Soup Element?

October 02, 2024

I'm working with HTML elements that contain child tags. I want to 'ignore' or remove those tags while keeping the text. Right now, if I call .string on any element that contains tags, all I

Solution 1:

import bs4

# Iterating over a tag yields its direct children; skip bare text nodes.
for child in soup.find(id='main'):
    if isinstance(child, bs4.Tag):
        print(child.text)

This prints:

This is a paragraph.
This is a paragraph with a tag.
This is another paragraph.
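
For reference, here is a minimal self-contained sketch of the loop above. The original question's HTML is not shown, so the markup below (a div with id "main" holding three paragraphs, one with a nested span) is an assumption chosen to match the output, and the names html and paragraphs are made up for illustration.

```python
import bs4

# Hypothetical markup: the question's real HTML is not shown in the post.
html = """
<div id="main">
<p>This is a paragraph.</p>
<p>This is a paragraph <span class="test">with a tag</span>.</p>
<p>This is another paragraph.</p>
</div>
"""

soup = bs4.BeautifulSoup(html, "html.parser")

# Iterating over a Tag yields its direct children, including the
# whitespace text nodes between the <p> tags, so keep only Tag children.
paragraphs = [
    child.get_text()
    for child in soup.find(id="main")
    if isinstance(child, bs4.Tag)
]

for text in paragraphs:
    print(text)
```

The isinstance check matters because the newlines between the paragraphs are NavigableString children, not tags.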

Solution 2:

Use the .strings iterable instead. Use ''.join() to pull in all strings and join them together:

# assuming main is the element of interest, e.g. main = soup.find(id='main')
print(''.join(main.strings))

Iterating over .strings yields each and every contained string, directly or in child tags.

Demo:

>>> print(''.join(main.strings))

This is a paragraph. 
This is a paragraph with a tag. 
This is another paragraph.
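
To see why .string falls short here, the sketch below compares it with .strings and get_text() on a single paragraph. The markup and the variable names (plain, joined, text) are assumptions for illustration, not taken from the original question.

```python
from bs4 import BeautifulSoup

# Hypothetical one-paragraph example with a nested tag.
html = "<p>This is a paragraph <b>with a tag</b>.</p>"
p = BeautifulSoup(html, "html.parser").p

# .string is None because <p> has more than one child
# (text, <b>, text), so there is no single string to return.
plain = p.string

# .strings walks every descendant string; joining them drops the tags.
joined = "".join(p.strings)

# get_text() does the same join in one call.
text = p.get_text()

print(plain)
print(joined)
print(text)
```

If the goal is only the combined text, get_text() is usually the shorter spelling; .strings is handy when each piece needs processing before joining.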
