2022.07.15

Trying Out Selenium

Tech Log

Overview
Environment Setup
How to Open a Specified URL
How to Display the Console
How to Log In
How to Access a Site as a Smartphone
How to Take a Screenshot
How to Scrape
Closing

It's been a while.

This is JG.

I recently tried out Selenium, so in this post I'd like to give a brief introduction to it.

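Overview

In a nutshell, Selenium is a tool for automating tests that run in a browser.
You write the tests you want to run in one of the programming languages below.

Programming languages supported by Selenium
・Java
・Python
・C#
・Ruby
・JavaScript
・Kotlin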

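Environment Setup

I only cover the steps for Windows here, but the steps on a Mac should be similar.

When you download chromedriver in step 3, make sure to pick the version that matches your Chrome version.

WebDriver is a tool for driving a browser automatically from a program.

1. Install Python
2. Run the following in Command Prompt
pip install selenium
3. Go to the URL below, download "chromedriver_win32.zip", and place the extracted file in your working directory
https://chromedriver.chromium.org/downloads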
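As a rough illustration of the version check in step 3, here is a minimal sketch that compares two version strings. It assumes that the first three components (major, minor, build) are what need to agree and that the last component can differ. The function names and the sample version strings are made up for this example.

```python
def version_prefix(version: str) -> tuple:
    """Return the first three numeric components of a dotted version string."""
    parts = version.strip().split(".")
    return tuple(int(part) for part in parts[:3])


def versions_match(chrome_version: str, driver_version: str) -> bool:
    """True when Chrome and chromedriver agree on major, minor and build."""
    return version_prefix(chrome_version) == version_prefix(driver_version)


# Sample version strings, made up for illustration
print(versions_match("103.0.5060.134", "103.0.5060.53"))   # True
print(versions_match("103.0.5060.134", "102.0.5005.61"))   # False
```

You can read Chrome's version at chrome://version and the driver's version with `chromedriver --version`, then pass both strings to versions_match().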

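How to Open a Specified URL

From here on, I'll use Python to show what Selenium can do.

The code is written for Selenium 4.

It should work if you copy and paste it.

The dc['acceptSslCerts'] setting is needed for pages that use a self-signed certificate.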

from selenium import webdriver
from selenium.webdriver.chrome import service as fs
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

# In Selenium 4 this style prints a deprecation warning, but the browser still starts
# browser = webdriver.Chrome('chromedriver.exe')

# With this style, pages that use a self-signed certificate show "Your connection is not private"
# chrome_service = fs.Service(executable_path='chromedriver')
# browser = webdriver.Chrome(service=chrome_service)

# Starts in Selenium 4 without the warning above
dc = DesiredCapabilities.CHROME.copy()
dc['acceptSslCerts'] = True
chrome_service = fs.Service(executable_path='chromedriver')
browser = webdriver.Chrome(service=chrome_service, desired_capabilities=dc)

# Open the specified URL
browser.get('https://www.yahoo.co.jp/')

# Close the browser and end the WebDriver session
browser.quit()
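The desired_capabilities argument used above is deprecated in Selenium 4, and later 4.x releases no longer accept it. The sketch below shows one possible alternative that configures everything through ChromeOptions. It assumes the W3C capability name acceptInsecureCerts is an acceptable substitute for acceptSslCerts. The function names and the driver_path default are hypothetical, and the Selenium imports sit inside open_url() so the small helper can be used on its own.

```python
def insecure_cert_capabilities() -> dict:
    """W3C capability asking the browser to accept untrusted certificates."""
    return {"acceptInsecureCerts": True}


def open_url(url: str, driver_path: str = "chromedriver") -> None:
    """Open url in Chrome configured through options only, then quit."""
    # Imported here so the helper above works without Selenium installed
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    for name, value in insecure_cert_capabilities().items():
        options.set_capability(name, value)

    browser = webdriver.Chrome(
        service=Service(executable_path=driver_path),
        options=options,
    )
    try:
        browser.get(url)
    finally:
        # Always end the session, even if loading the page fails
        browser.quit()
```

Calling open_url('https://www.yahoo.co.jp/') would start Chrome, load the page, and close it again.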

How to Display the Console

This is how to print the contents of the browser console.

from selenium import webdriver
from selenium.webdriver.chrome import service as fs
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

dc = DesiredCapabilities.CHROME.copy()
dc['acceptSslCerts'] = True
chrome_service = fs.Service(executable_path='chromedriver')
browser = webdriver.Chrome(service=chrome_service, desired_capabilities=dc)

# Open the specified URL
browser.get('https://www.yahoo.co.jp/')

# Print each browser console entry
for entry in browser.get_log('browser'):
    print(entry)

# Close the browser and end the WebDriver session
browser.quit()
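The entries printed above are plain dictionaries, so they can be filtered and formatted before printing. The following is a minimal sketch that assumes each entry has 'level', 'message', and 'timestamp' keys, with the timestamp in milliseconds since the Unix epoch. The helper names and the sample entries are made up for illustration and are not real output.

```python
from datetime import datetime, timezone


def format_entry(entry: dict) -> str:
    """Turn one log entry into a single readable line (time shown in UTC)."""
    # 'timestamp' is assumed to be milliseconds since the Unix epoch
    when = datetime.fromtimestamp(entry["timestamp"] / 1000, tz=timezone.utc)
    return f"{when:%Y-%m-%d %H:%M:%S} [{entry['level']}] {entry['message']}"


def only_level(entries: list, level: str) -> list:
    """Keep only the entries whose 'level' equals level, e.g. 'SEVERE'."""
    return [entry for entry in entries if entry["level"] == level]


# Hand-written sample entries, not real browser output
sample = [
    {"level": "INFO", "message": "page loaded", "timestamp": 1657843200000},
    {"level": "SEVERE", "message": "Failed to load resource", "timestamp": 1657843201000},
]
for entry in only_level(sample, "SEVERE"):
    print(format_entry(entry))  # 2022-07-15 00:00:01 [SEVERE] Failed to load resource
```

In the script above, you could pass browser.get_log('browser') to only_level() in place of the sample list.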

How to Log In

This is how to enter an ID and password and log in.

find_element(By.ID, ...) looks up an element by its id attribute. Likewise, find_element(By.NAME, ...) looks it up by its name attribute.

from selenium import webdriver
from selenium.webdriver.chrome import service as fs
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.common.by import By
import time

dc = DesiredCapabilities.CHROME.copy()
dc['acceptSslCerts'] = True
chrome_service = fs.Service(executable_path='chromedriver')
browser = webdriver.Chrome(service=chrome_service, desired_capabilities=dc)

# Open the specified URL
browser.get('https://login.yahoo.co.jp/config/login?.src=www&.done=https://www.yahoo.co.jp/')

# Print each browser console entry
for entry in browser.get_log('browser'):
    print(entry)

# Enter the login ID (replace 'id' with your own)
browser.find_element(By.ID, 'username').send_keys('id')

# Click the button, then wait 5 seconds for the next screen
browser.find_element(By.ID, 'btnNext').click()
time.sleep(5)

# Enter the password (replace 'password' with your own)
# find_element_by_name() is deprecated in Selenium 4, so use find_element(By.NAME, ...)
password = browser.find_element(By.NAME, 'passwd')
password.clear()
password.send_keys('password')

# Click the button
browser.find_element(By.ID, 'btnSubmit').click()

# Print each browser console entry
for entry in browser.get_log('browser'):
    print(entry)

# Close the browser and end the WebDriver session
browser.quit()
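The fixed time.sleep(5) above waits the full five seconds even when the password field is ready sooner, and can still be too short on a slow connection. Here is a minimal sketch of a polling wait written in plain Python. The name wait_until and its parameters are hypothetical. Selenium also ships its own WebDriverWait class in selenium.webdriver.support.ui for the same purpose.

```python
import time


def wait_until(condition, timeout: float = 10.0, interval: float = 0.5,
               ignored: tuple = (Exception,)):
    """Call condition() until it returns a truthy value or timeout seconds pass.

    Exceptions listed in ignored are treated as "not ready yet".
    Returns the truthy value, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
        except ignored:
            result = None
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Stand-in condition that becomes ready on the third call
attempts = []


def ready_on_third_call():
    attempts.append(1)
    return "ready" if len(attempts) >= 3 else None


print(wait_until(ready_on_third_call, timeout=2, interval=0.01))  # ready
```

In the login script, something like wait_until(lambda: browser.find_element(By.NAME, 'passwd')) could take the place of time.sleep(5) and would return the password element as soon as it exists.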

How to Access a Site as a Smartphone

To access a site as a smartphone, it appears that you set deviceName to the name of the device.

from selenium import webdriver
from selenium.webdriver.chrome import service as fs
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

dc = DesiredCapabilities.CHROME.copy()
dc['acceptSslCerts'] = True
dc['goog:loggingPrefs'] = {'browser': 'ALL'}
chrome_service = fs.Service(executable_path='chromedriver')

# Settings for accessing the site as a smartphone
mobile_emulation = {'deviceName': 'Galaxy Fold'}
options = webdriver.ChromeOptions()
options.add_experimental_option('mobileEmulation', mobile_emulation)

browser = webdriver.Chrome(service=chrome_service, desired_capabilities=dc, options=options)

# Open the specified URL
browser.get('https://www.yahoo.co.jp/')

# Print each browser console entry
for entry in browser.get_log('browser'):
    print(entry)

# Close the browser and end the WebDriver session
browser.quit()
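If the device you want is not available by name, ChromeDriver's mobileEmulation option also accepts explicit screen metrics and a user agent instead of deviceName. The sketch below only builds that dictionary. The function name, the sizes, and the user agent string are example values chosen for illustration.

```python
def custom_mobile_emulation(width: int, height: int, pixel_ratio: float,
                            user_agent: str) -> dict:
    """Build a mobileEmulation dictionary from explicit device metrics."""
    return {
        "deviceMetrics": {
            "width": width,
            "height": height,
            "pixelRatio": pixel_ratio,
        },
        "userAgent": user_agent,
    }


# Example values only; adjust them to the device you want to imitate
EXAMPLE_USER_AGENT = (
    "Mozilla/5.0 (Linux; Android 10; Pixel 4) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/103.0.0.0 Mobile Safari/537.36"
)
mobile_emulation = custom_mobile_emulation(360, 640, 3.0, EXAMPLE_USER_AGENT)
print(mobile_emulation["deviceMetrics"])
```

The resulting dictionary can be passed to options.add_experimental_option('mobileEmulation', mobile_emulation) exactly like the deviceName version.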

How to Take a Screenshot

This is how to take a screenshot.

from selenium import webdriver
from selenium.webdriver.chrome import service as fs
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
import os
import datetime

dc = DesiredCapabilities.CHROME.copy()
dc['acceptSslCerts'] = True
dc['goog:loggingPrefs'] = {'browser': 'ALL'}
chrome_service = fs.Service(executable_path='chromedriver')
browser = webdriver.Chrome(service=chrome_service, desired_capabilities=dc)

# Open the specified URL
browser.get('https://www.yahoo.co.jp/')

# Take a screenshot after resizing the window to the full size of the page
w = browser.execute_script('return document.body.scrollWidth;')
h = browser.execute_script('return document.body.scrollHeight;')
browser.set_window_size(w, h)
dt_now = datetime.datetime.now()
# Note: an "images" folder must already exist next to this file, or the screenshot will not be saved
file_name = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'images', dt_now.strftime('%Y%m%d%H%M%S') + '.png')
browser.save_screenshot(file_name)

# Print each browser console entry
for entry in browser.get_log('browser'):
    print(entry)

# Close the browser and end the WebDriver session
browser.quit()
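The script above depends on the images folder already being there. As a minimal sketch, the helper below builds the same kind of timestamped path and creates the folder when it is missing. The name screenshot_path and its parameters are hypothetical, and the demo runs in a temporary directory so that nothing is left behind.

```python
import datetime
import tempfile
from pathlib import Path


def screenshot_path(base_dir, now=None) -> Path:
    """Return base_dir/images/<timestamp>.png, creating images/ if missing."""
    now = now or datetime.datetime.now()
    images_dir = Path(base_dir) / "images"
    images_dir.mkdir(parents=True, exist_ok=True)
    return images_dir / f"{now:%Y%m%d%H%M%S}.png"


# Demo in a throwaway directory with a fixed timestamp
with tempfile.TemporaryDirectory() as tmp:
    path = screenshot_path(tmp, datetime.datetime(2022, 7, 15, 12, 34, 56))
    print(path.name)             # 20220715123456.png
    print(path.parent.is_dir())  # True
```

In the screenshot script, browser.save_screenshot(str(screenshot_path(Path(__file__).parent))) would then work even on the first run.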

How to Scrape

Finally, here is how to scrape pages.

from selenium import webdriver
from selenium.webdriver.chrome import service as fs
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.chrome.options import Options
import os
import datetime
import urllib.error
import urllib.request
import ssl

# Workaround for pages that use a self-signed certificate, which would otherwise raise an error
# Note that this turns off certificate verification for every urllib request in this script
ssl._create_default_https_context = ssl._create_unverified_context

dc = DesiredCapabilities.CHROME.copy()
dc['acceptSslCerts'] = True
dc['goog:loggingPrefs'] = {'browser': 'ALL'}
chrome_service = fs.Service(executable_path='chromedriver')

# Enable headless mode, which runs without showing a GUI
options = Options()
options.add_argument('--headless')
browser = webdriver.Chrome(service=chrome_service, desired_capabilities=dc, options=options)

# List of URLs to scrape
urls = [
    'https://www.yahoo.co.jp/',
    'https://www.google.com/'
]

for url in urls:
    # Check the response code and skip the URL on an HTTP error
    req = urllib.request.Request(url)
    try:
        urllib.request.urlopen(req)
    except urllib.error.HTTPError as e:
        print(url, 'status', e.code)
        continue

    # Print each browser console entry
    browser.get(url)
    for entry in browser.get_log('browser'):
        print(url, entry)

    # Take a screenshot (the "images" folder must already exist next to this file)
    w = browser.execute_script('return document.body.scrollWidth;')
    h = browser.execute_script('return document.body.scrollHeight;')
    browser.set_window_size(w, h)
    dt_now = datetime.datetime.now()
    file_name = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'images', dt_now.strftime('%Y%m%d%H%M%S') + '.png')
    browser.save_screenshot(file_name)

# Close the browser and end the WebDriver session
browser.quit()
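The response-code check in the loop above only catches HTTPError, so a URL that cannot be reached at all (for example, a DNS failure) would still stop the script. Here is a minimal sketch of a helper that covers both cases. The name fetch_status and the opener parameter are hypothetical; the opener exists only so that the function can be tried without network access.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, opener=urllib.request.urlopen, timeout: float = 10.0):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status such as 404 or 500
        return e.code
    except urllib.error.URLError:
        # No answer at all: DNS failure, refused connection, and so on
        return None


def should_scrape(status) -> bool:
    """Scrape only when a response arrived and it was not an error status."""
    return status is not None and status < 400
```

Inside the loop, status = fetch_status(url) followed by "if not should_scrape(status): continue" could replace the try/except block. HTTPError has to be caught before URLError because it is a subclass of URLError.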

Closing

That was a quick tour of Selenium.

If this caught your interest, or if you copied the code and it doesn't work, feel free to ask JG.

About the author

Joined: 2020

From: Kanagawa Prefecture

Work: Programming

Skills and hobbies: Watching soccer
