Posted 2019-07-31 | Updated 2020-10-25 | Python

Notes on Writing a Point-Farming Script

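Back home for the holidays, I got a feature request from my mom: help her farm points on an online course site.
~~Good kids, don't try this at home~~

Sure thing, knocking out a bit of JS is dead easy.

Then I found out that logging in to this site requires scanning a QR code with its app.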

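Hmm, so how is app QR-code login actually implemented?
After some searching I got the rough idea:
1. The server generates a QR code. Running it through a decoder shows that it contains a URL plus a string passed as a parameter. The parameter is also generated by the server and is the unique id that will identify the user.
2. The phone, already logged in, scans the QR code and visits that URL, and the server binds the (logged-in) user to this id.
3. The web client keeps polling the server to ask whether the id has been bound to a user. Once the binding is done, the next poll request gets that user's information back.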
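
The three steps above can be sketched as a toy in-memory model. This is only my illustration of the flow, not how this particular site does it: the class `QrLoginServer`, its method names, the `example.com` URL and the user name `alice` are all made up.

```python
import secrets


class QrLoginServer:
    """Toy in-memory model of the server side of QR-code login (hypothetical)."""

    def __init__(self):
        # Maps each QR token to the bound user, or None while unscanned.
        self._sessions = {}

    def new_qr_url(self):
        # Step 1: generate a unique token and embed it in the QR code URL.
        token = secrets.token_hex(8)
        self._sessions[token] = None
        return "https://example.com/qr-login?token=" + token, token

    def scan(self, token, logged_in_user):
        # Step 2: the logged-in phone app visits the URL; bind user to token.
        if token not in self._sessions:
            raise KeyError("unknown or expired token")
        self._sessions[token] = logged_in_user

    def poll(self, token):
        # Step 3: the web page polls; returns None until the token is bound.
        return self._sessions.get(token)


server = QrLoginServer()
qr_url, token = server.new_qr_url()
before_scan = server.poll(token)   # nothing is bound yet
server.scan(token, "alice")        # the phone scans the code
after_scan = server.poll(token)    # the web page now gets the user
```

A real server would also expire tokens and authenticate the phone's request; both are left out here to keep the flow visible.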

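~~Doesn't look hard, I'll build one myself some day~~

I used selenium, the browser-automation testing library for Python, to simulate the browser actions.
The requirement: log in by scanning the QR code, click 6 random articles, simulate reading them, then exit.

I first tried fetching the QR code with requests, but that did not work. After some digging it turned out the QR code is generated dynamically by JS and is not in the original HTML at all. Ouch.

So I had no choice but to grab it through webdriver as well.

Code first. This is the first version, and I may optimize it later.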


# -*- coding:utf-8 -*-
import base64
import random
import time
import tkinter as tk
from io import BytesIO
from multiprocessing import Process

from PIL import Image, ImageTk
from selenium import webdriver
from selenium.webdriver.common.by import By

options = webdriver.FirefoxOptions()
# Headless mode did not work for me, so these stay commented out.
# options.add_argument('-headless')
# options.add_argument('--disable-gpu')

url = "https://pc.xuexi.cn/points/login.html?ref=https%3A%2F%2Fwww.xuexi.cn%2Fd184e7597cc0da16f5d9f182907f1200%2F9a3668c13f6e303932b5e0e100fc248b.html"


def save_in_mem_qrcode(Qrcode):
    # Decode the base64 payload into an in-memory file object.
    return BytesIO(base64.b64decode(Qrcode))


def save_in_file_qrcode(Qrcode):
    # Decode the base64 payload and write it to qr-code.png.
    with open("qr-code.png", "wb") as fp:
        fp.write(base64.b64decode(Qrcode))


class AutoLearn():
    def __init__(self):
        # Pick 6 distinct article indexes from 1..19.
        self.ids = random.sample(range(1, 20), 6)
        self.passages_url = "https://www.xuexi.cn/d184e7597cc0da16f5d9f182907f1200/9a3668c13f6e303932b5e0e100fc248b.html"
        print(self.ids)
        self.driver = webdriver.Firefox(options=options)
        self.driver.maximize_window()

    def auto_read(self):
        self.scan_qrcode()
        # Leave the login iframe first; get() fails while a frame is selected.
        self.driver.switch_to.default_content()
        try:
            self.driver.get(self.passages_url)
        except Exception as e:
            print(e)
        else:
            self.driver.implicitly_wait(30)
            main_handle = self.driver.current_window_handle

            for article_id in self.ids:
                try:
                    self.driver.find_elements(By.XPATH, '//div[@class="text-link-item-title"]/div[1]')[article_id].click()
                    # The article opens in a new tab; switch to it.
                    handles = self.driver.window_handles
                    self.driver.switch_to.window(handles[1])
                    print(self.driver.current_url)
                    height = 0
                    # Scroll down in 8 random steps to simulate reading.
                    for _ in range(8):
                        self.driver.execute_script("window.scrollTo(100, {});".format(height))
                        time.sleep(random.randint(10, 20))
                        height += random.randint(200, 250)
                    time.sleep(10)
                    self.driver.execute_script("window.scrollTo(100, document.body.scrollHeight);")
                    time.sleep(10)
                    self.driver.close()
                except Exception as e:
                    print(e)
                    # Close the article tab only if we actually switched to it.
                    if self.driver.current_window_handle != main_handle:
                        self.driver.close()
                finally:
                    # Always return to the article list before the next round.
                    self.driver.switch_to.window(main_handle)

    def scan_qrcode(self):
        try:
            self.driver.get(url)
        except Exception as e:
            print(e)
        else:
            self.driver.implicitly_wait(60)
            # The QR code lives inside the login iframe.
            self.driver.switch_to.frame('ddlogin-iframe')
            src = self.driver.find_element(By.XPATH, "//*[@id='qrcode']/img").get_attribute('src')
            # src is a data URI of the form "data:image/png;base64,<payload>".
            qrcode = src.split(',')[1]

            # Show the QR code in a separate process; join() blocks until the window is closed.
            p = Process(target=QR_GUI, args=(qrcode, ))
            p.start()
            p.join()
            # save_in_file_qrcode(qrcode)
            # path = "qr-code.png"

    def __del__(self):
        # quit() closes every window and ends the browser session.
        self.driver.quit()


def QR_GUI(code):
    root = tk.Tk()
    root.wm_resizable(False, False)  # do not allow resizing the window
    root.title("Scan the QR code to log in")
    fp = save_in_mem_qrcode(code)
    img_open = Image.open(fp)
    img_png = ImageTk.PhotoImage(img_open)
    label_img = tk.Label(root, image=img_png)
    label_img.pack()
    root.mainloop()


if __name__ == '__main__':
    t = AutoLearn()
    t.auto_read()

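For some reason it does not work in headless mode; it has to run with a visible browser window.
Of course I ran into quite a few problems. I list them below for future reference:

How do I switch to a newly opened page?

Every page (window or tab) has a handle that uniquely identifies it. handles = driver.window_handles returns the handles of all pages currently open in the browser, and driver.switch_to.window(handles[x]) then switches to page x.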
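
Indexing with handles[1] works when exactly one extra tab is open, but I would not rely on the order of window_handles in general. A slightly safer sketch is to compare the handle list before and after the click. The helper name `find_new_handle` is my own invention, and the plain strings below stand in for real handles:

```python
def find_new_handle(handles_before, handles_after):
    """Return the one handle in handles_after that is not in handles_before."""
    new_handles = [h for h in handles_after if h not in handles_before]
    if len(new_handles) != 1:
        raise ValueError("expected exactly one new window, got %d" % len(new_handles))
    return new_handles[0]


# With a real driver this would be used roughly as:
#   before = driver.window_handles
#   link.click()
#   driver.switch_to.window(find_new_handle(before, driver.window_handles))
example = find_new_handle(["main"], ["main", "article"])
```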

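How do I drag the scrollbar?

driver.execute_script("window.scrollTo(100, {});".format(height))
PS: my front-end knowledge is shaky and I have forgotten how the various sizes relate to each other, so the scroll positions in the source are picked arbitrarily. The first argument of window.scrollTo is the horizontal offset and the second is the vertical one.
Of course, running driver.execute_script("window.scrollTo(100, document.body.scrollHeight);") jumps straight to the bottom of the page.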
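
To make the scrolling logic easier to see on its own, here is a small sketch that only builds the JavaScript strings; each one would then be passed to driver.execute_script. The function name `scroll_scripts` and its parameters are hypothetical, and I use 0 as the horizontal offset instead of the 100 in my script:

```python
import random


def scroll_scripts(steps=8, min_step=200, max_step=250, seed=None):
    """Build the JavaScript snippets for a gradual downward scroll."""
    rng = random.Random(seed)
    height = 0
    scripts = []
    for _ in range(steps):
        scripts.append("window.scrollTo(0, {});".format(height))
        # Move down by a random amount, as a reader would.
        height += rng.randint(min_step, max_step)
    # Finish by jumping to the very bottom of the page.
    scripts.append("window.scrollTo(0, document.body.scrollHeight);")
    return scripts


scripts = scroll_scripts(steps=3, seed=1)
```

In the real script a random time.sleep between the calls is what makes it look like reading.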

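How do I put an image into Tkinter?

from PIL import Image,ImageTk
You need the pillow library, as in the code below:

# Create an image object that Tkinter can manage
img_open = Image.open(fp)
img_png = ImageTk.PhotoImage(img_open)
label_img = tk.Label(root, image=img_png)

Here fp is normally the path of the image file; you can also pass file=xxx directly to PhotoImage and skip the first step. The difference in my case is that I keep the image in memory, so fp is a file object. Do not pass a mode to Image.open; if you do pass one, it must be 'r' (don't write 'rb' yourself just because the data is binary).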

I hit a bug!

Calling driver.get("xxx") raised:

InvalidArgumentException: Message: Malformed URL: can't access dead object

The cause:
I think you switched into a frame just before .get(). And you cannot open an url in the frame. (from Stack Overflow)
The fix: call driver.switch_to.default_content() before driver.get(). Older Selenium versions spell it driver.switch_to_default_content().

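I found a magical img tag!

Like this:
<img alt="Scan me!" style="display: block;" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAANIAAADSCAYAAAA/…

The base64 encoding of the image is written directly into the HTML. You can save the image, base64-encode the file yourself, and you will find the result is identical. This is called the Data URI scheme.
The goal is to embed small pieces of data directly in the page so they do not have to be loaded from an external file. That reduces server load: since the content of the image file sits right in the HTML, one HTTP request is saved. The downside is that the browser cannot cache such an image separately from the page that embeds it.

Put simply, base64 translates 8-bit data into standard ASCII characters.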
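
My script just does src.split(',')[1] to get at the payload. A slightly more careful sketch of taking such a data URI apart is below; `parse_data_uri` is a name I made up, and it only handles the base64 form shown above:

```python
import base64


def parse_data_uri(uri):
    """Split a base64 data URI into (media type, decoded bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, _, payload = uri[len("data:"):].partition(",")
    if not header.endswith(";base64"):
        raise ValueError("only base64-encoded data URIs are handled here")
    media_type = header[:-len(";base64")]
    return media_type, base64.b64decode(payload)


# Round trip with the 8-byte signature that every PNG file starts with.
png_signature = b"\x89PNG\r\n\x1a\n"
encoded = base64.b64encode(png_signature).decode("ascii")
media_type, data = parse_data_uri("data:image/png;base64," + encoded)
```

The encoded signature begins with iVBORw0KGgo, which is why the src attribute above starts with those characters.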

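The path is right, so why can't I reach the element?

It may be a frame problem:
Web applications often embed pages in frames. WebDriver can only locate elements in one page at a time, so elements inside a nested frame cannot be located directly. In that case, use switch_to.frame() to move the current context into the frame first.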
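
One way to avoid forgetting to switch back (which is what caused the "dead object" error earlier) is to wrap the switch in a context manager. This is only a sketch: `inside_frame` is a hypothetical helper, and `FakeDriver` is a stand-in that records calls so the example runs without a browser. With Selenium you would pass the real driver instead.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Select a frame for the duration of a with-block, then switch back."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Runs even if the block raises, so later get() calls are safe.
        driver.switch_to.default_content()


class FakeSwitchTo:
    """Records the two switch_to calls the helper makes."""

    def __init__(self, log):
        self.log = log

    def frame(self, frame_reference):
        self.log.append("enter " + frame_reference)

    def default_content(self):
        self.log.append("leave")


class FakeDriver:
    def __init__(self):
        self.log = []
        self.switch_to = FakeSwitchTo(self.log)


fake = FakeDriver()
with inside_frame(fake, "ddlogin-iframe"):
    fake.log.append("find the QR code here")
```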

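Afterword

This program still has plenty of rough edges, and it is far from feature-complete.

The assembly-language posts are coming soon ~~coo, coo (I'll probably flake)~~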


http://cyx0706.github.io/2019/07/31/auto-learnVer1/

Author: Ctwo


#Web Crawler automation
