scrapegraphai.utils.screenshot_scraping package

Submodules

scrapegraphai.utils.screenshot_scraping.screenshot_preparation module

screenshot_preparation module

scrapegraphai.utils.screenshot_scraping.screenshot_preparation.crop_image(image, LEFT=None, TOP=None, RIGHT=None, BOTTOM=None, save_path: Optional[str] = None)

Crop an image using the specified coordinates.

Parameters:

image (PIL.Image): The image to be cropped.
LEFT (int, optional): The x-coordinate of the left edge of the crop area. Defaults to None.
TOP (int, optional): The y-coordinate of the top edge of the crop area. Defaults to None.
RIGHT (int, optional): The x-coordinate of the right edge of the crop area. Defaults to None.
BOTTOM (int, optional): The y-coordinate of the bottom edge of the crop area. Defaults to None.
save_path (str, optional): The path to save the cropped image. Defaults to None.

Returns:

The cropped image.

Return type:

PIL.Image

Notes

If any of the coordinates (LEFT, TOP, RIGHT, BOTTOM) is None, it will be set to the corresponding edge of the image. If save_path is specified, the cropped image will be saved as a JPEG file at the specified path.
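The defaulting rule in the note above can be sketched in plain Python. This is an illustration of the documented behavior, not the library's own code: resolve_crop_box is a hypothetical helper, and it assumes that "the corresponding edge" means 0 for LEFT and TOP and the image width and height for RIGHT and BOTTOM.

```python
def resolve_crop_box(width, height, left=None, top=None, right=None, bottom=None):
    """Return a (left, top, right, bottom) box, filling None with the image edges.

    Hypothetical helper: mirrors the behavior described in the Notes,
    assuming the image's top-left corner is (0, 0).
    """
    left = 0 if left is None else left
    top = 0 if top is None else top
    right = width if right is None else right
    bottom = height if bottom is None else bottom
    return (left, top, right, bottom)


# Only the top edge is given, so the other three fall back to the image edges.
print(resolve_crop_box(1280, 720, top=100))  # prints (0, 100, 1280, 720)
```

Under that reading, calling crop_image(image, TOP=100) on a 1280x720 image would keep the full width and drop only the top 100 pixel rows.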

scrapegraphai.utils.screenshot_scraping.screenshot_preparation.select_area_with_ipywidget(image)

Lets you manually select an area of an image using ipywidgets. Recommended when the project runs on Google Colab, Kaggle, or a similar notebook platform; otherwise use select_area_with_opencv().

Parameters:

image (PIL.Image): The input image.

Returns:

None

scrapegraphai.utils.screenshot_scraping.screenshot_preparation.select_area_with_opencv(image)

Lets you manually select an area of an image using OpenCV. Recommended when the project runs on a local machine; otherwise use select_area_with_ipywidget().

Parameters:

image (PIL.Image): The image from which to select an area.

Returns:

A tuple containing the LEFT, TOP, RIGHT, and BOTTOM coordinates of the selected area.
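Because the returned tuple uses the same LEFT, TOP, RIGHT, BOTTOM order as the crop_image() keyword arguments, the two functions can be chained. The following is a minimal sketch under that assumption; crop_selected_area is a hypothetical helper, and its select and crop parameters exist only so the flow can be exercised without opening an OpenCV window.

```python
def crop_selected_area(image, save_path=None, select=None, crop=None):
    """Let the user pick a region interactively, then crop the image to it.

    Hypothetical helper. `select` and `crop` default to the library functions
    documented on this page; they can be replaced with stand-ins for testing.
    """
    if select is None or crop is None:
        # Imported lazily so the sketch can be loaded without the package installed.
        from scrapegraphai.utils.screenshot_scraping import screenshot_preparation as sp
        select = select or sp.select_area_with_opencv
        crop = crop or sp.crop_image

    # The selection comes back as (LEFT, TOP, RIGHT, BOTTOM).
    left, top, right, bottom = select(image)
    return crop(image, LEFT=left, TOP=top, RIGHT=right, BOTTOM=bottom,
                save_path=save_path)
```

On a local machine, crop_selected_area(image) would open the selection window and return the cropped PIL.Image.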

async scrapegraphai.utils.screenshot_scraping.screenshot_preparation.take_screenshot(url: str, save_path: Optional[str] = None, quality: int = 100)

Takes a screenshot of the webpage at the specified URL and, if save_path is given, saves it to that path.

Parameters:

url (str): The URL of the webpage to take a screenshot of.
save_path (str, optional): The path to save the screenshot to. Defaults to None.
quality (int): The quality of the JPEG image, between 1 and 100. Defaults to 100.

Returns:

The screenshot of the webpage as a PIL Image object.

Return type:

PIL.Image
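Since take_screenshot() is declared async, it has to be awaited or driven by an event loop. The sketch below shows one way to call it and to reject an out-of-range quality before any browser work starts. capture_page and its screenshot parameter are hypothetical names introduced for illustration; whether the library validates quality itself is not stated here.

```python
import asyncio


async def capture_page(url, save_path=None, quality=100, screenshot=None):
    """Validate the arguments, then await the screenshot coroutine.

    Hypothetical wrapper. `screenshot` defaults to the library's
    take_screenshot; a stand-in coroutine can be passed for testing.
    """
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    if screenshot is None:
        # Imported lazily so the sketch can be loaded without the package installed.
        from scrapegraphai.utils.screenshot_scraping.screenshot_preparation import (
            take_screenshot as screenshot,
        )
    return await screenshot(url, save_path=save_path, quality=quality)
```

From synchronous code, the call would look like asyncio.run(capture_page("https://example.com", save_path="page.jpg", quality=80)); inside a notebook that already runs an event loop, await capture_page(...) directly instead.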

scrapegraphai.utils.screenshot_scraping.text_detection module

text_detection module

scrapegraphai.utils.screenshot_scraping.text_detection.detect_text(image, languages: list = ['en'])

Detects and extracts text from a given image.

Parameters:

image (PIL.Image): The input image to extract text from.
languages (list): A list of languages to detect text in. Defaults to ["en"]. The list of supported languages can be found here: https://github.com/VikParuchuri/surya/blob/master/surya/languages.py

Returns:

The extracted text from the image.

Return type:

str

Notes

Model weights will automatically download the first time you run this function.
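The two modules on this page are meant to be used together: take a screenshot, optionally crop it, then run text detection. A minimal sketch of the screenshot-to-text path follows, assuming the signatures documented above. screenshot_to_text and its screenshot and detect parameters are hypothetical; they are there so the flow can be tried without a browser or model weights.

```python
import asyncio


async def screenshot_to_text(url, languages=None, screenshot=None, detect=None):
    """Capture a page and return the text detected in the screenshot.

    Hypothetical helper. `screenshot` and `detect` default to the library's
    take_screenshot and detect_text; stand-ins can be passed for testing.
    """
    if screenshot is None:
        from scrapegraphai.utils.screenshot_scraping.screenshot_preparation import (
            take_screenshot as screenshot,
        )
    if detect is None:
        from scrapegraphai.utils.screenshot_scraping.text_detection import (
            detect_text as detect,
        )

    image = await screenshot(url)
    # detect_text is documented as a regular (non-async) function.
    return detect(image, languages=languages or ["en"])
```

A call such as asyncio.run(screenshot_to_text("https://example.com", languages=["en", "it"])) would then return a str; expect the first run to take longer while the model weights download.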

Module contents