pandas: how do I keep the first table's index when merging two tables?

李魔佛 posted an article • 0 comments • 100 views • 2021-05-29 22:17 • from related topics

df1:

        tickerBond  closePriceBond  bondPremRatio secShortNameBond tickerEqu  \
secID
110066      110066          199.94        -1.2442         盛屯转债    600711
110067      110067          119.53        25.9204         华安转债    600909
113021      113021          105.81        45.0989         中信转债    601998
113024      113024          101.94        36.6668         核建转债    601611
113025      113025          129.16         0.0409         明泰转债    601677

df2:

        ROE tickerEqu
0  2.642931    600711
1  4.425438    600909
2  6.259092    601998
3  4.432315    601611
4  6.454054    601677

With pd.merge(df1, df2, on='tickerEqu') the two tables are merged on the tickerEqu column, but the index of the merged result is rebuilt as 0, 1, 2, 3, ...

Is there a way to keep df1's index? A plain join raises an error, because df2's index does not match df1's.

Solution: first run df1 = df1.reset_index(), do the merge, and then set the secID column as the index again.
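The steps above can be sketched on a two-row sample as follows. The sample values are taken from the tables in this post; the variable names `merged` and `joined` are only illustrative. The second variant, `DataFrame.join` with `on=`, is an alternative that is not in the original answer: it matches a column of df1 against the index of the other frame and keeps df1's index.

```python
import pandas as pd

# Small stand-ins for df1 (indexed by secID) and df2 (default 0..n index).
df1 = pd.DataFrame(
    {"tickerBond": ["110066", "110067"], "tickerEqu": ["600711", "600909"]},
    index=pd.Index(["110066", "110067"], name="secID"),
)
df2 = pd.DataFrame({"ROE": [2.642931, 4.425438], "tickerEqu": ["600711", "600909"]})

# Approach from the post: move the index into a column, merge, restore the index.
merged = pd.merge(df1.reset_index(), df2, on="tickerEqu").set_index("secID")

# Alternative: index df2 by the key column, then join df1's column against it.
joined = df1.join(df2.set_index("tickerEqu"), on="tickerEqu")

print(merged)
print(joined)
```

Both results are indexed by secID. Note that `pd.merge` defaults to an inner join, so rows of df1 without a match in df2 are dropped unless `how="left"` is passed.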

pip install peewee : AttributeError: 'str' object has no attribute 'decode'

李魔佛 posted an article • 0 comments • 224 views • 2021-04-26 23:58 • from related topics

 
 
    ERROR: Command errored out with exit status 1:
command: 'C:\anaconda\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\xda\\AppData\
\Local\\Temp\\pip-install-ftotbzih\\peewee\\setup.py'"'"'; __file__='"'"'C:\\Users\\xda\\AppData\\Local\\Temp\\pip-insta
ll-ftotbzih\\peewee\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"
'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\xda\AppData\Lo
cal\Temp\pip-pip-egg-info-8ou7yi3i'
cwd: C:\Users\xda\AppData\Local\Temp\pip-install-ftotbzih\peewee\
Complete output (15 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\xda\AppData\Local\Temp\pip-install-ftotbzih\peewee\setup.py", line 99, in <module>
elif not _have_sqlite_extension_support():
File "C:\Users\xda\AppData\Local\Temp\pip-install-ftotbzih\peewee\setup.py", line 76, in _have_sqlite_extension_su
pport
compiler.compile([src_file], output_dir=tmp_dir),
File "C:\anaconda\lib\distutils\_msvccompiler.py", line 327, in compile
self.initialize()
File "C:\anaconda\lib\distutils\_msvccompiler.py", line 224, in initialize
vc_env = _get_vc_env(plat_spec)
File "C:\anaconda\lib\site-packages\setuptools\msvc.py", line 314, in msvc14_get_vc_env
return _msvc14_get_vc_env(plat_spec)
File "C:\anaconda\lib\site-packages\setuptools\msvc.py", line 273, in _msvc14_get_vc_env
out = subprocess.check_output(
AttributeError: 'str' object has no attribute 'decode'
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

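The same encoding problem, and the same fix:
Locate the file msvc.py (in the traceback above: C:\anaconda\lib\site-packages\setuptools\msvc.py).
Around line 276:
    out = subprocess.check_output(
        'cmd /u /c "{}" {} && set'.format(vcvarsall, plat_spec),
        stderr=subprocess.STDOUT,
    )
    # ).decode('utf-16le', errors='replace')
Commenting out the .decode(...) part is enough.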
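The error means that `subprocess.check_output` already returned a `str` in this environment, so calling `.decode()` on it fails. A more defensive pattern than deleting the call is to decode only when the output really is bytes. The helper below is a hypothetical sketch to illustrate that idea; `run_and_get_text` is not part of setuptools, and the encoding default is an assumption.

```python
import subprocess
import sys


def run_and_get_text(cmd, encoding="utf-8"):
    """Run a command and always return its output as str."""
    out = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
    # Decode only if we actually got bytes; a str is passed through unchanged.
    if isinstance(out, bytes):
        out = out.decode(encoding, errors="replace")
    return out


text = run_and_get_text([sys.executable, "-c", "print('ok')"])
print(text.strip())
```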
 

Baidu English/Chinese speech-to-text service: Python demo (the official demo rewritten with requests)

李魔佛 posted an article • 0 comments • 260 views • 2021-04-25 11:49 • from related topics

 
 
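The official demo is written with the lower-level urllib, which feels awkward once you are used to requests. So I rewrote it and wrapped it into functions.
Official demo:
https://github.com/Baidu-AIP/speech-demo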
# -*- coding: utf-8 -*-
# @Time : 2021/4/24 20:50
# @File : baidu_voice_service.py
# @Author : Rocky C@www.30daydo.com

import os
import pickle
import sys
from base64 import b64encode
from pathlib import PurePath

import requests

sys.path.append('..')
from config import API_KEY, SECRET_KEY


BASE = PurePath(__file__).parent
# cached token, stored next to this script (used for both saving and loading)
TOKEN_FILE = os.path.join(BASE, 'token.pkl')

# audio file to recognize
AUDIO_FILE = r'C:\OtherGit\speech-demo\rest-api-asr\python\audio\2.m4a'  # only pcm/wav/amr are supported; the fast version (极速版) also supports m4a
# file format, taken from the file extension
FORMAT = AUDIO_FILE[-3:]

CUID = '24057753'
# sample rate
RATE = 16000  # fixed value


ASR_URL = 'http://vop.baidu.com/server_api'

# To test the self-training platform, enable the settings below. Once your model is online
# you will see "Step 2: get the dedicated model parameters pid:8001, modelid:1234";
# take dev_pid=8001, lm_id=1234 from there.
'''
http://vop.baidu.com/server_api
1537 Mandarin (Chinese only)  input-method model  with punctuation     custom vocabulary supported
1737 English                  English model       without punctuation  custom vocabulary not supported
1637 Cantonese                Cantonese model     with punctuation     custom vocabulary not supported
1837 Sichuanese               Sichuanese model    with punctuation     custom vocabulary not supported
1936 Mandarin, far field
'''
DEV_PID = 1737

SCOPE = 'brain_enhanced_asr'  # this scope means ASR is enabled; if it is missing, enable the fast version on the web console


class DemoError(Exception):
    pass


TOKEN_URL = 'http://openapi.baidu.com/oauth/2.0/token'


def fetch_token():
    params = {'grant_type': 'client_credentials',
              'client_id': API_KEY,
              'client_secret': SECRET_KEY}
    r = requests.post(
        url=TOKEN_URL,
        data=params
    )

    result = r.json()
    if 'access_token' in result and 'scope' in result:
        if SCOPE and SCOPE not in result['scope'].split(' '):  # set SCOPE = False to skip this check
            raise DemoError('scope is not correct')

        return result['access_token']

    else:
        raise DemoError('MAYBE API_KEY or SECRET_KEY not correct: access_token or scope not found in token response')


""" TOKEN end """


def dump_token(token):
    with open(TOKEN_FILE, 'wb') as fp:
        pickle.dump({'token': token}, fp)


def load_token(filename):
    # fetch and cache a token on first use, afterwards reuse the cached one
    if not os.path.exists(filename):
        token = fetch_token()
        dump_token(token)
        return token
    else:
        with open(filename, 'rb') as fp:
            token = pickle.load(fp)
            return token['token']


def recognize_service(token, filename):
    with open(filename, 'rb') as speech_file:
        speech_data = speech_file.read()

    length = len(speech_data)
    if length == 0:
        raise DemoError('file %s length read 0 bytes' % filename)

    # b64encode returns bytes; decode to str so that requests can serialize it as JSON
    b64_data = b64encode(speech_data).decode('utf-8')
    params = {'cuid': CUID, 'token': token, 'dev_pid': DEV_PID, 'speech': b64_data,
              'len': length, 'format': FORMAT, 'rate': RATE, 'channel': 1}

    headers = {
        'Content-Type': 'application/json',
    }
    r = requests.post(url=ASR_URL, json=params, headers=headers)
    return r.json()


def main():
    token = load_token(TOKEN_FILE)
    result = recognize_service(token, AUDIO_FILE)
    # on failure the response has no 'result' key, so print the whole response instead
    print(result.get('result', result))


if __name__ == '__main__':
    main()



Just put in your own keys and it is ready to use.
I recorded a few English clips to test it, and the results were quite accurate.
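One detail of the request body is easy to get wrong: `b64encode` returns bytes, which cannot be serialized as JSON, and `len` is the size of the raw audio rather than of the base64 text. The sketch below builds only the payload, without any network call. The field names follow the script above; `build_asr_payload`, the `demo-device` cuid and the dummy token are illustrative assumptions.

```python
import json
from base64 import b64encode


def build_asr_payload(audio_bytes, token, cuid="demo-device", dev_pid=1737,
                      audio_format="wav", rate=16000):
    """Build the JSON body for the recognition request (no request is sent)."""
    if not audio_bytes:
        raise ValueError("audio is empty")
    return {
        "cuid": cuid,
        "token": token,
        "dev_pid": dev_pid,
        "format": audio_format,
        "rate": rate,
        "channel": 1,
        # decode the base64 bytes to str so the dict is JSON-serializable
        "speech": b64encode(audio_bytes).decode("ascii"),
        # length of the raw audio in bytes, not of the base64 string
        "len": len(audio_bytes),
    }


payload = build_asr_payload(b"\x00\x01" * 8, token="dummy-token")
body = json.dumps(payload)
print(body)
```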

A local code search script, implemented in Python

李魔佛 posted an article • 0 comments • 230 views • 2021-04-14 19:34 • from related topics

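Normally find + grep would do the job, but with several search paths and several rules the regular expressions quickly get out of hand.
find . -type f -name "*.py" | xargs grep "redis"

The command above looks for the string redis in .py files.

Searching several locations, however, means chaining several pipes. On top of that, I want the keywords combined with AND: all of them must appear in the same file, though not necessarily on the same line, which is awkward to express with grep.

So I wrote a Python script, which is also convenient to run on CentOS.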
# -*- coding: utf-8 -*-
# @Time : 2021/4/14 1:46
# @File : search_string_in_folder.py
# @Author : Rocky C@www.30daydo.com

'''
Code search script
'''
import fire
import glob
import re

# TODO: rewrite this with PyQt

PATH_LIST = [r'C:\git\\', r'C:\OtherGit\\', r'C:\OneDrive\viewed_code\\']
POST_FIX = 'py'  # file extension to search
# keywords
WORDS = []

EXCLUDE_PATH = [r'C:\OtherGit\cpython']

DEBUG = True


class FileSearcher:

    def __init__(self, kw):
        self.root_path_list = PATH_LIST
        self.default_coding = 'utf-8'
        self.exception_handle_coding = 'gbk'
        # fire passes a single keyword as a scalar and "a,b" as a tuple
        if not isinstance(kw, tuple):
            kw = (kw,)
        self.kw = [str(k).strip() for k in kw]

    def search(self, file, encoding):
        # returns (decoded_ok, {keyword: found}, [matching line numbers])
        match_dict = {w: False for w in self.kw}
        line_list = []
        try:
            with open(file, 'r', encoding=encoding) as fp:
                # line numbers are 1-based and count blank lines too
                for line_number, line in enumerate(fp, 1):
                    for w in self.kw:
                        if re.search(w, line, re.IGNORECASE):
                            match_dict[w] = True
                            line_list.append(line_number)

        except UnicodeDecodeError as e:
            # wrong encoding: tell the caller to retry with another one
            if DEBUG:
                print(f'Error coding in file {file}')
                print(e)
            return None, None, None

        except Exception as e:
            # any other error: keep what was found so far
            if DEBUG:
                print(f'Error in file {file}')
                print(e)

        return True, match_dict, line_list

    def print_match_result(self, file, line_list, encoding):
        wanted = set(line_list)
        try:
            with open(file, 'r', encoding=encoding) as fp:
                for line_number, line in enumerate(fp, 1):
                    if line_number in wanted:
                        print(f'{file} :: {line_number} ====>\n {line.strip()[:50]}\n')
        except Exception as e:
            if DEBUG:
                print(f'Error in file {file}')
                print(e)

    def run(self):
        for path in self.root_path_list:

            search_path = path + '**/*.' + POST_FIX

            for file in glob.iglob(search_path, recursive=True):

                # skip excluded folders (compare with backslashes removed)
                temp_file = file.replace('\\', '')
                if any(ex_path.replace('\\', '') in temp_file for ex_path in EXCLUDE_PATH):
                    continue

                use_encoding = self.default_coding
                encode_proper, match_dict, line_list = self.search(file, use_encoding)

                if not encode_proper:
                    # utf-8 failed, retry with gbk
                    use_encoding = self.exception_handle_coding
                    encode_proper, match_dict, line_list = self.search(file, use_encoding)

                # report the file only when every keyword was found in it
                if match_dict is not None and len(match_dict) > 0 and all(match_dict.values()):
                    self.print_match_result(file, line_list, use_encoding)


def test_error_file():
    path = r'C:\git\CodePool\example-code\19-dyn-attr-prop\oscon\schedule2.py'
    with open(path, 'r', encoding='utf8') as fp:
        while 1:
            x = fp.readline()
            if not x:
                break
            print(x)


def main(kw):
    app = FileSearcher(kw)
    app.run()


if __name__ == '__main__':
    fire.Fire(main)

Run it with: python main.py --kw=asyncio,gather
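The core rule of the script, "every keyword must occur somewhere in the file, not necessarily on the same line", can be shown in isolation. This is a minimal sketch on an in-memory string; `contains_all` is an illustrative name and not part of the script above.

```python
import re


def contains_all(text, keywords):
    """True only if every keyword (treated as a regex) occurs somewhere in text."""
    return all(re.search(kw, text, re.IGNORECASE) for kw in keywords)


sample = "import asyncio\n\nresults = await asyncio.gather(*tasks)\n"

print(contains_all(sample, ["asyncio", "gather"]))  # both keywords present
print(contains_all(sample, ["asyncio", "redis"]))   # "redis" is missing
```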
 


 

Checking whether a function is a coroutine

李魔佛 posted an article • 0 comments • 182 views • 2021-04-09 20:27 • from related topics

Pass in the function name itself, without parentheses:
from inspect import iscoroutinefunction

import pyppeteer

# Placeholder values: the original post does not show these two settings.
UserDataDir = './user_data'
URL = 'https://example.com'


def check_coroutine(fun):
    if iscoroutinefunction(fun):
        print('is a coroutine function')
    else:
        print('is not a coroutine function')


async def visit_web():
    browser = await pyppeteer.launch(
        {'headless': False,
         'userDataDir': UserDataDir,
         'defaultViewport': {'width': 1800, 'height': 1000},
         'ignoreDefaultArgs': True,
         }
    )
    page = await browser.newPage()

    # this can be configured in launch() instead
    # await page.setViewport({
    #     "width": 1900,
    #     "height": 1020
    # })

    # run the JS below first, then goto
    await page.evaluate(
        '''() =>{ Object.defineProperties(navigator,{ webdriver:{ get: () => false } }) }''')
    # await page.screenshot({'path': 'test.png', 'fullPage': True})
    # await page.pdf({'path': 'test.pdf'})
    # await asyncio.sleep(5)

    await page.goto(url=URL)

    # the JS here is evaluated asynchronously
    dimensions = await page.evaluate(
        '''
        ()=>{
            return {
                width:document.documentElement.clientWidth,
                height:document.documentElement.clientHeight,
                deviceScaleFactor_:window.devicePixelRatio,
            }
        }
        '''
    )

    result = await page.evaluate(
        '''
        ()=>{
            var title = document.title;
            return {title:title};
        }
        '''
    )

    await browser.close()

Usage:
check_coroutine(visit_web)
Note: do not pass visit_web() here. Calling the function creates a coroutine object, and iscoroutinefunction returns False for that.
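The difference between the function and the object it returns can be checked without a browser. A minimal sketch using only the standard library; `fetch` is an illustrative stand-in for `visit_web`.

```python
import inspect


async def fetch():
    return 42


# The function itself is a coroutine function.
is_func = inspect.iscoroutinefunction(fetch)

# Calling it does not run the body; it returns a coroutine object.
coro = fetch()
is_func_on_call = inspect.iscoroutinefunction(coro)
is_coro_obj = inspect.iscoroutine(coro)
coro.close()  # close it explicitly to avoid a "never awaited" warning

print(is_func, is_func_on_call, is_coro_obj)
```

To test an already created coroutine object, `inspect.iscoroutine` is the matching check.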

 

Converting a JS array with many commas (,,,,,,,,,) into a Python list

李魔佛 posted an article • 0 comments • 203 views • 2021-03-29 18:54 • from related topics

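I am not sure whether this is normal JS style, but an array can be written like this:
var arr = [,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,2,3,4,5]
The leading commas mean there is no data at those positions (think None or 0), so the JS code does not have to fill anything in. How can Python turn this into a list?

There are two methods:

1. The simplest: since ,, means 0,0, we can replace every pair of commas with 0,0,
But if the number of leading commas is odd, say three,
arr=[,,,1,2,3]
then replacing pairs of commas with 0,0, gives 0,0,,1,2,3
which is still wrong:
a pair of commas is left over.
You can then run one more replacement that turns each remaining ,, into a single comma. Note that this yields 0,0,1,2,3, one element fewer than the three empty slots in the JS array, so it only works when the exact positions do not matter.

2. Use finditer to find the start and end of every run of two or more commas, then fill each run with the right number of 0, entries.
for m in re.finditer(',{2,}', text):  # text is the array literal as a string
    start=m.start()
    end=m.end()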
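Instead of regex replacement, the array literal can also be split on commas, which keeps the number of empty slots exact. This is a swapped-in technique, not the author's method, and it assumes a flat array of numbers (no nested arrays, no strings containing commas). The name `js_sparse_array_to_list` and the `fill` parameter are illustrative.

```python
import json


def js_sparse_array_to_list(text, fill=None):
    """Convert a flat JS array literal with empty slots into a Python list."""
    inner = text[text.index("[") + 1 : text.rindex("]")]
    if not inner.strip():
        return []
    items = [item.strip() for item in inner.split(",")]
    # JS ignores one trailing comma: [1,2,] has two elements.
    if items[-1] == "":
        items.pop()
    # Every remaining empty item is an empty slot.
    return [fill if item == "" else json.loads(item) for item in items]


print(js_sparse_array_to_list("var arr = [,,,1,2,3]", fill=0))
print(js_sparse_array_to_list("[1,,2]"))
```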
    

Converting Excel data with Python to fit the Flourish data format

李魔佛 posted an article • 0 comments • 359 views • 2021-02-20 00:28 • from related topics

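The Flourish visualization site expects the time axis in the Excel file to run along the columns: with 1000 data points I need 1000 columns. That is the transpose of a DataFrame's default layout, so the DataFrame's rows have to become columns.

With this much data, the xlwt engine that pandas uses for .xls files does not work, because the .xls format allows at most 256 columns. So I used the xlsxwriter library and did the conversion by hand.

import xlsxwriter  # import the module
workbook = xlsxwriter.Workbook('new_people.xlsx')  # create the Excel workbook
worksheet = workbook.add_worksheet('sheet1')  # create a sheet named "sheet1"

Write the rows back out as columns:
# assumes df has the default 0..n-1 index and 上市日期 (listing date) holds strings
for index, item in df.iterrows():
    date = item['上市日期']
    count = item['申购人数']  # number of subscribers
    date = date.replace(' 00:00:00', '')
    worksheet.write(0, index, date)
    worksheet.write(1, index, count)

workbook.close()

index is the column number; by writing into the first and second rows over and over, we get the layout we need.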
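The same row-to-column reshaping can also be done inside pandas before anything is written. A minimal sketch with three made-up sample rows (the values are illustrative, only the column names come from the post); `wide` is an assumed name.

```python
import pandas as pd

# Made-up sample data with the two columns used in the post.
df = pd.DataFrame({
    "上市日期": ["2021-01-04 00:00:00", "2021-01-05 00:00:00", "2021-01-06 00:00:00"],
    "申购人数": [1200, 1350, 990],
})

# Strip the time part, then lay the data out as two rows and one column per record.
dates = df["上市日期"].str.replace(" 00:00:00", "", regex=False)
wide = pd.DataFrame([dates.tolist(), df["申购人数"].tolist()])

print(wide)
```

Assuming xlsxwriter is installed, `wide.to_excel('new_people.xlsx', header=False, index=False, engine='xlsxwriter')` should then write the same two-row sheet; that call is not exercised here.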
 

The Python added by the Node.js installer overrides the original Python version

李魔佛 posted an article • 0 comments • 778 views • 2021-01-29 14:58 • from related topics

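If you tick the Python option at the end of the Node.js installer, it installs the latest Python for you and even adds it to the environment variables automatically, so the new Python now takes precedence over your original one. How thoughtful.
Fix:
Windows 10: open the environment variables, find the first PATH entry containing python39 (or something similar), and move it down, preferably to the very bottom.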
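To see which Python-related entries come first on PATH, the variable can be inspected from Python itself. A rough sketch: `python_entries` is an illustrative helper, the sample PATH is made up, and matching on the text "python" is only a heuristic (an Anaconda folder, for example, would not match).

```python
import os


def python_entries(path_value, sep=os.pathsep):
    """Return (position, entry) for every PATH entry whose text mentions 'python'."""
    entries = [e for e in path_value.split(sep) if e]
    return [(i, e) for i, e in enumerate(entries) if "python" in e.lower()]


# Made-up Windows-style PATH; earlier entries win when resolving "python".
sample = r"C:\Users\me\AppData\Local\Programs\Python\Python39;C:\Windows\system32;C:\Python37"
found = python_entries(sample, sep=";")
for position, entry in found:
    print(position, entry)
```

On a real machine, call `python_entries(os.environ["PATH"])`; on Windows, `where python` in a command prompt lists the matching executables in resolution order.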

Parsing Windows log files with Python to check whether the server has been attacked

李魔佛 posted an article • 0 comments • 1009 views • 2021-01-17 23:49 • from related topics

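I recently skimmed the Windows Server event logs and found quite a few login attempts from IPs in other regions, some of which succeeded. I could not tell whether those were my own logins, so I used Python to parse the log and look up the geographic location of each IP.

Final result:

[Screenshot of the console output: cmd_vKiBIjQLpd.png]

[Russian IPs are scanning and brute-forcing passwords every day; even after I changed the port they keep enumerating.]

The rough code is as follows: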
import mmap
import contextlib
import re
from xml.dom import minidom

from Evtx.Evtx import FileHeader
from Evtx.Views import evtx_file_xml_view
from ip_convertor import IP  # the author's own IP-to-location helper, not shown here


class WindowsLogger():

    def __init__(self, path):
        self.path = path
        self.formator = 'IP:{:10}\tlocation:{:20}\tUser:{:15}\tProcess:{}'

    def read_file(self):
        with open(self.path, 'r') as f:
            with contextlib.closing(mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)) as buf:
                fh = FileHeader(buf, 0)
                return fh

        return None

    def parse_log_detail(self, filteID):
        with open(self.path, 'r') as f:
            with contextlib.closing(mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)) as buf:
                fh = FileHeader(buf, 0)
                for xml, record in evtx_file_xml_view(fh):
                    # only print events whose ID matches (4624 = successful logon)
                    # InterestEvent(xml,4624)
                    for IpAddress, ip, targetUsername, ProcessName in self.filter_event(xml, filteID):
                        print(self.formator.format(IpAddress, ip, targetUsername, ProcessName))

    # drop the events we do not need and yield the interesting ones
    def filter_event(self, xml, EventID, use_filter=True):
        xmldoc = minidom.parseString(xml)
        collections = xmldoc.documentElement
        events = xmldoc.getElementsByTagName('Event')
        for evt in events:
            # event ID from the EventID node
            eventId = evt.getElementsByTagName('EventID')[0].childNodes[0].data
            time_create = evt.getElementsByTagName('TimeCreated')[0].getAttribute('SystemTime')
            event_data_nodes = evt.getElementsByTagName('EventData')
            if not event_data_nodes:
                continue
            eventData = event_data_nodes[0]

            # defaults, so that a missing or empty field cannot raise
            IpAddress = targetUsername = ProcessName = ''
            for data in eventData.getElementsByTagName('Data'):
                value = data.childNodes[0].data if data.childNodes else ''

                if data.getAttribute('Name') == 'IpAddress':
                    IpAddress = value

                if data.getAttribute('Name') == 'TargetUserName':
                    targetUsername = value

                if data.getAttribute('Name') == 'ProcessName':
                    ProcessName = value

            if use_filter is True and eventId == EventID:
                ip = ''
                # only look up values that start with a digit (skips '-' and empty fields)
                if re.search(r'^\d+', IpAddress):
                    ip = IP(IpAddress).ip_address

                yield IpAddress, ip, targetUsername, ProcessName


def main():
    path = r'D:\share\1.evtx'
    filter_id = '4624'
    app = WindowsLogger(path)
    app.parse_log_detail(filter_id)


if __name__ == '__main__':
    main()

D:\share\1.evtx is the exported log file.
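The field extraction step can be tried without an .evtx file or the Evtx package. The sketch below parses a hand-written sample event that only mimics the shape of a 4624 record; the XML content, the IP (from a documentation range) and the name `extract_logon_fields` are all illustrative assumptions.

```python
from xml.dom import minidom

# Hand-written sample in the shape of a logon event; not real log data.
SAMPLE_EVENT = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <EventID>4624</EventID>
    <TimeCreated SystemTime="2021-01-17 12:00:00"/>
  </System>
  <EventData>
    <Data Name="TargetUserName">Administrator</Data>
    <Data Name="IpAddress">203.0.113.7</Data>
    <Data Name="ProcessName"></Data>
  </EventData>
</Event>"""


def extract_logon_fields(xml_text, wanted=("IpAddress", "TargetUserName", "ProcessName")):
    """Return (event_id, {field: value}); missing or empty fields become ''."""
    doc = minidom.parseString(xml_text)
    event_id = doc.getElementsByTagName("EventID")[0].firstChild.data
    fields = {name: "" for name in wanted}
    for node in doc.getElementsByTagName("Data"):
        name = node.getAttribute("Name")
        if name in fields and node.firstChild is not None:
            fields[name] = node.firstChild.data
    return event_id, fields


event_id, fields = extract_logon_fields(SAMPLE_EVENT)
print(event_id, fields)
```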

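Original article; please credit the source when reposting:
http://30daydo.com/article/44130

The full code is available by replying "windows日志解析" to the official WeChat account (公众号).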
 

 

Maotai flash-sale scripts for JD (京东) and Suning (苏宁)

李魔佛 posted an article • 0 comments • 4450 views • 2021-01-05 22:34 • from related topics

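Maotai flash-sale fever has taken off recently, so here is a Python script for it.
Runtime environment: Windows, Linux, macOS; Python 3+

Check your JD Xiaobai credit score (京东小白分) here:
https://plus.m.jd.com/rights/windControl
If your score is too low, don't bother taking part; your odds will be much smaller.

============= 2021-01-13 update ============
The latest version is rewritten in Go, and it got me a few bottles.

Suning (苏宁):

============= 2021-01-11 update ============

Suning's flash sale feels like a prank: the button is almost never clickable, so I gave up. It looks like they released very little stock, and given Suning's track record, I called it a night.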

main.py

import sys

from maotai.jd_spider_requests import ProdectPurchase


if __name__ == '__main__':
    tip = """
Functions:
 1. Reserve the product
 2. Flash-sale purchase
    """
    print(tip)

    product = ProdectPurchase()
    choice_function = input('Please choose: ')
    if choice_function == '1':
        product.reserve()
    elif choice_function == '2':
        product.seckill_by_proc_pool()
    else:
        print('No such function')
        sys.exit(1)







jd_spider_requests.py

import random
import time
import requests
import functools
import json
import os
import pickle

from lxml import etree

from error.exception import SKException
from maotai.jd_logger import logger
from maotai.timer import Timer
from maotai.config import global_config
from concurrent.futures import ProcessPoolExecutor
from helper.jd_helper import (
    parse_json,
    send_wechat,
    wait_some_time,
    response_status,
    save_image,
    open_image
)


class SpiderSession:
    """
    Session handling
    """

    def __init__(self):
        self.cookies_dir_path = "./cookies/"
        self.user_agent = global_config.getRaw('config', 'DEFAULT_USER_AGENT')

        self.session = self._init_session()

    def _init_session(self):
        session = requests.session()
        session.headers = self.get_headers()
        return session

    def get_headers(self):
        return {"User-Agent": self.user_agent,
                "Accept": "text/html,application/xhtml+xml,application/xml;"
                          "q=0.9,image/webp,image/apng,*/*;"
                          "q=0.8,application/signed-exchange;"
                          "v=b3",
                "Connection": "keep-alive"}

    def get_user_agent(self):
        return self.user_agent

    def get_session(self):
        """
        Get the current session
        :return:
        """
        return self.session

    def get_cookies(self):
        """
        Get the current cookies
        :return:
        """
        return self.get_session().cookies

    def set_cookies(self, cookies):
        self.session.cookies.update(cookies)

    def load_cookies_from_local(self):
        """
        Load cookies from the local cookies directory
        :return:
        """
        cookies_file = ''
        if not os.path.exists(self.cookies_dir_path):
            return False
        # use the first *.cookies file found
        for name in os.listdir(self.cookies_dir_path):
            if name.endswith(".cookies"):
                cookies_file = '{}{}'.format(self.cookies_dir_path, name)
                break
        if cookies_file == '':
            return False
        with open(cookies_file, 'rb') as f:
            local_cookies = pickle.load(f)
        self.set_cookies(local_cookies)

    def save_cookies_to_local(self, cookie_file_name):
        """
        Save cookies to a local file
        :param cookie_file_name: name of the file that stores the cookies
        :return:
        """
        cookies_file = '{}{}.cookies'.format(self.cookies_dir_path, cookie_file_name)
        directory = os.path.dirname(cookies_file)
        if not os.path.exists(directory):
            os.makedirs(directory)
        with open(cookies_file, 'wb') as f:
            pickle.dump(self.get_cookies(), f)


class QrLogin:
    """
    QR code login
    """

    def __init__(self, spider_session: SpiderSession):
        """
        Initialize QR code login
        Rough flow:
            1. Visit the login QR code page and get a token
            2. Use the token to get a ticket
            3. Validate the ticket
        :param spider_session:
        """
        self.qrcode_img_file = 'qr_code.png'

        self.spider_session = spider_session
        self.session = self.spider_session.get_session()

        self.is_login = False
        self.refresh_login_status()

    def refresh_login_status(self):
        """
        Refresh the login status
        :return:
        """
        self.is_login = self._validate_cookies()

    def _validate_cookies(self):
        """
        Check whether the cookies are valid (i.e. whether we are logged in)
        This requests the user's order list page: when not logged in, it redirects to the login page.
        :return: whether the cookies are valid, True/False
        """
        url = 'https://order.jd.com/center/list.action'
        payload = {
            'rid': str(int(time.time() * 1000)),
        }
        try:
            resp = self.session.get(url=url, params=payload, allow_redirects=False)
            if resp.status_code == requests.codes.OK:
                return True
        except Exception as e:
            logger.error("Exception while validating cookies: %s", e)
        return False

    def _get_login_page(self):
        """
        Fetch the PC login page
        Blocking; updates the cookies
        :return:
        """
        url = "https://passport.jd.com/new/login.aspx"
        page = self.session.get(url, headers=self.spider_session.get_headers())
        return page

    def _get_qrcode(self):
        """
        Save and display the login QR code
        :return:
        """
        url = 'https://qr.m.jd.com/show'
        payload = {
            'appid': 133,
            'size': 147,
            't': str(int(time.time() * 1000)),
        }
        headers = {
            'User-Agent': self.spider_session.get_user_agent(),
            'Referer': 'https://passport.jd.com/new/login.aspx',
        }
        resp = self.session.get(url=url, headers=headers, params=payload)

        if not response_status(resp):
            logger.info('Failed to get the QR code')
            return False

        save_image(resp, self.qrcode_img_file)
        logger.info('QR code fetched; please scan it with the JD app')
        open_image(self.qrcode_img_file)
        return True

    def _get_qrcode_ticket(self):
        """
        Use the token to get the ticket
        :return:
        """
        url = 'https://qr.m.jd.com/check'
        payload = {
            'appid': '133',
            'callback': 'jQuery{}'.format(random.randint(1000000, 9999999)),
            'token': self.session.cookies.get('wlfstk_smdl'),  # value taken from the cookies
            '_': str(int(time.time() * 1000)),
        }
        headers = {
            'User-Agent': self.spider_session.get_user_agent(),
            'Referer': 'https://passport.jd.com/new/login.aspx',
        }
        resp = self.session.get(url=url, headers=headers, params=payload)

        if not response_status(resp):
            logger.error('Error while getting the QR code scan result')
            return False

        resp_json = parse_json(resp.text)
        if resp_json['code'] != 200:
            logger.info('Code: %s, Message: %s', resp_json['code'], resp_json['msg'])
            return None
        else:
            logger.info('Confirmed on the mobile client')
            return resp_json['ticket']

    def _validate_qrcode_ticket(self, ticket):
        """
        Validate the ticket obtained earlier
        :param ticket: the ticket obtained earlier
        :return:
        """
        url = 'https://passport.jd.com/uc/qrCodeTicketValidation'
        headers = {
            'User-Agent': self.spider_session.get_user_agent(),
            'Referer': 'https://passport.jd.com/uc/login?ltype=logout',
        }

        resp = self.session.get(url=url, headers=headers, params={'t': ticket})
        if not response_status(resp):
            return False

        resp_json = json.loads(resp.text)
        if resp_json['returnCode'] == 0:
            return True
        else:
            logger.info(resp_json)
            return False

    def login_by_qrcode(self):
        """
        Log in with the QR code
        :return:
        """
        self._get_login_page()  # updates the cookies

        # download QR code
        if not self._get_qrcode():
            raise SKException('Failed to download the QR code')

        # get QR code ticket
        ticket = None
        retry_times = 85
        for _ in range(retry_times):
            # retry until we get the ticket
            ticket = self._get_qrcode_ticket()
            if ticket:
                break
            time.sleep(2)
        else:
            # the loop ended without break: no ticket within the retry limit
            raise SKException('QR code expired; please fetch and scan a new one')

        # validate QR code ticket
        if not self._validate_qrcode_ticket(ticket):
            raise SKException('QR code validation failed')

        self.refresh_login_status()

        logger.info('QR code login succeeded')


class ProdectPurchase(object):
    def __init__(self):
        self.spider_session = SpiderSession()
        self.spider_session.load_cookies_from_local()
        # QrLogin shares the same session

        self.qrlogin = QrLogin(self.spider_session)

        # initial settings
        self.sku_id = global_config.getRaw('config', 'sku_id')
        self.seckill_num = global_config.getRaw('config', 'seckill_num')
        # process_num comes from the config file; convert it so that
        # range(self.work_count) in seckill_by_proc_pool accepts it
        self.work_count = int(global_config.getRaw('config', 'process_num'))
        self.seckill_init_info = dict()
        self.seckill_url = dict()
        self.seckill_order_data = dict()
        self.timers = Timer()

        self.session = self.spider_session.get_session()
        self.user_agent = self.spider_session.user_agent
        self.nick_name = None

def login_by_qrcode(self):
"""
二维码登陆
:return:
"""
if self.qrlogin.is_login:
logger.info('登录成功')
return

self.qrlogin.login_by_qrcode()

if self.qrlogin.is_login:
self.nick_name = self.get_username()
self.spider_session.save_cookies_to_local(self.nick_name)
else:
raise SKException("二维码登录失败!")

def check_login(func):
"""
用户登陆态校验装饰器。若用户未登陆,则调用扫码登陆
"""

@functools.wraps(func)
def new_func(self, *args, **kwargs):
if not self.qrlogin.is_login:
logger.info("{0} 需登陆后调用,开始扫码登陆".format(func.__name__))
self.login_by_qrcode()
return func(self, *args, **kwargs)

return new_func

@check_login
def reserve(self):
"""
预约
"""
self._reserve()

@check_login
def seckill(self):
"""
抢购
"""
self._seckill()

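    @check_login
    def seckill_by_proc_pool(self):
        """
        Run the seckill in multiple processes.
        work_count: number of processes
        """
        with ProcessPoolExecutor() as pool:
            # int() guards against work_count arriving as a string from the config
            for i in range(int(self.work_count)):
                pool.submit(self.seckill)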

    def _reserve(self):
        """
        Reserve.
        """
        while True:
            try:
                self.make_reserve()
                break
            except Exception as e:
                # the exception needs a %s placeholder; logging cannot
                # format an extra argument without one
                logger.info('预约发生异常! %s', e)
            wait_some_time()

    def _seckill(self):
        """
        Seckill (flash purchase).
        """
        while True:
            try:
                self.request_seckill_url()
                while True:
                    self.request_seckill_checkout_page()
                    self.submit_seckill_order()
            except Exception as e:
                logger.info('抢购发生异常,稍后继续执行! %s', e)
            wait_some_time()

def make_reserve(self):
"""商品预约"""
logger.info('商品名称:{}'.format(self.get_sku_title()))
url = 'https://yushou.jd.com/youshouinfo.action?'
payload = {
'callback': 'fetchJSON',
'sku': self.sku_id,
'_': str(int(time.time() * 1000)),
}
headers = {
'User-Agent': self.user_agent,
'Referer': 'https://item.jd.com/{}.html'.format(self.sku_id),
}
resp = self.session.get(url=url, params=payload, headers=headers)
resp_json = parse_json(resp.text)
reserve_url = resp_json.get('url')
# self.timers.start()
while True:
try:
self.session.get(url='https:' + reserve_url)
logger.info('预约成功,已获得抢购资格 / 您已成功预约过了,无需重复预约')
if global_config.getRaw('messenger', 'enable') == 'true':
success_message = "预约成功,已获得抢购资格 / 您已成功预约过了,无需重复预约"
send_wechat(success_message)
break
except Exception as e:
logger.error('预约失败正在重试...')

def get_username(self):
"""获取用户信息"""
url = 'https://passport.jd.com/user/petName/getUserInfoForMiniJd.action'
payload = {
'callback': 'jQuery{}'.format(random.randint(1000000, 9999999)),
'_': str(int(time.time() * 1000)),
}
headers = {
'User-Agent': self.user_agent,
'Referer': 'https://order.jd.com/center/list.action',
}

resp = self.session.get(url=url, params=payload, headers=headers)

try_count = 5
while not resp.text.startswith("jQuery"):
try_count = try_count - 1
if try_count > 0:
resp = self.session.get(url=url, params=payload, headers=headers)
else:
break
wait_some_time()
# 响应中包含了许多用户信息,现在在其中返回昵称
# jQuery2381773({"imgUrl":"//storage.360buyimg.com/i.imageUpload/xxx.jpg","lastLoginTime":"","nickName":"xxx","plusStatus":"0","realName":"xxx","userLevel":x,"userScoreVO":{"accountScore":xx,"activityScore":xx,"consumptionScore":xxxxx,"default":false,"financeScore":xxx,"pin":"xxx","riskScore":x,"totalScore":xxxxx}})
return parse_json(resp.text).get('nickName')

def get_sku_title(self):
"""获取商品名称"""
url = 'https://item.jd.com/{}.html'.format(global_config.getRaw('config', 'sku_id'))
resp = self.session.get(url).content
x_data = etree.HTML(resp)
sku_title = x_data.xpath('/html/head/title/text()')
return sku_title[0]

def get_seckill_url(self):
"""获取商品的抢购链接
点击"抢购"按钮后,会有两次302跳转,最后到达订单结算页面
这里返回第一次跳转后的页面url,作为商品的抢购链接
:return: 商品的抢购链接
"""
url = 'https://itemko.jd.com/itemShowBtn'
payload = {
'callback': 'jQuery{}'.format(random.randint(1000000, 9999999)),
'skuId': self.sku_id,
'from': 'pc',
'_': str(int(time.time() * 1000)),
}
headers = {
'User-Agent': self.user_agent,
'Host': 'itemko.jd.com',
'Referer': 'https://item.jd.com/{}.html'.format(self.sku_id),
}
while True:
resp = self.session.get(url=url, headers=headers, params=payload)
resp_json = parse_json(resp.text)
if resp_json.get('url'):
# https://divide.jd.com/user_rou ... %3Dpc
router_url = 'https:' + resp_json.get('url')
# https://marathon.jd.com/captch ... %3Dpc
seckill_url = router_url.replace(
'divide', 'marathon').replace(
'user_routing', 'captcha.html')
logger.info("抢购链接获取成功: %s", seckill_url)
return seckill_url
else:
logger.info("抢购链接获取失败,稍后自动重试")
wait_some_time()

def request_seckill_url(self):
"""访问商品的抢购链接(用于设置cookie等"""
logger.info('用户:{}'.format(self.get_username()))
logger.info('商品名称:{}'.format(self.get_sku_title()))
self.timers.start() # 阻塞

self.seckill_url[self.sku_id] = self.get_seckill_url()
logger.info('访问商品的抢购连接...')
headers = {
'User-Agent': self.user_agent,
'Host': 'marathon.jd.com',
'Referer': 'https://item.jd.com/{}.html'.format(self.sku_id),
}
self.session.get(
url=self.seckill_url.get(
self.sku_id),
headers=headers,
allow_redirects=False)

def request_seckill_checkout_page(self):
"""访问抢购订单结算页面"""
logger.info('访问抢购订单结算页面...')
url = 'https://marathon.jd.com/seckill/seckill.action'
payload = {
'skuId': self.sku_id,
'num': self.seckill_num,
'rid': int(time.time())
}
headers = {
'User-Agent': self.user_agent,
'Host': 'marathon.jd.com',
'Referer': 'https://item.jd.com/{}.html'.format(self.sku_id),
}
self.session.get(url=url, params=payload, headers=headers, allow_redirects=False)

def _get_seckill_init_info(self):
"""获取秒杀初始化信息(包括:地址,发票,token)
:return: 初始化信息组成的dict
"""
logger.info('获取秒杀初始化信息...')
url = 'https://marathon.jd.com/seckillnew/orderService/pc/init.action'
data = {
'sku': self.sku_id,
'num': self.seckill_num,
'isModifyAddress': 'false',
}
headers = {
'User-Agent': self.user_agent,
'Host': 'marathon.jd.com',
}
resp = self.session.post(url=url, data=data, headers=headers)

resp_json = None
try:
resp_json = parse_json(resp.text)
except Exception:
raise SKException('抢购失败,返回信息:{}'.format(resp.text[0: 128]))

return resp_json

def _get_seckill_order_data(self):
"""生成提交抢购订单所需的请求体参数
:return: 请求体参数组成的dict
"""
logger.info('生成提交抢购订单所需参数...')
# 获取用户秒杀初始化信息
self.seckill_init_info[self.sku_id] = self._get_seckill_init_info()
init_info = self.seckill_init_info.get(self.sku_id)
default_address = init_info['addressList'][0] # 默认地址dict
invoice_info = init_info.get('invoiceInfo', {}) # 默认发票信息dict, 有可能不返回
token = init_info['token']
data = {
'skuId': self.sku_id,
'num': self.seckill_num,
'addressId': default_address['id'],
'yuShou': 'true',
'isModifyAddress': 'false',
'name': default_address['name'],
'provinceId': default_address['provinceId'],
'cityId': default_address['cityId'],
'countyId': default_address['countyId'],
'townId': default_address['townId'],
'addressDetail': default_address['addressDetail'],
'mobile': default_address['mobile'],
'mobileKey': default_address['mobileKey'],
'email': default_address.get('email', ''),
'postCode': '',
'invoiceTitle': invoice_info.get('invoiceTitle', -1),
'invoiceCompanyName': '',
'invoiceContent': invoice_info.get('invoiceContentType', 1),
'invoiceTaxpayerNO': '',
'invoiceEmail': '',
'invoicePhone': invoice_info.get('invoicePhone', ''),
'invoicePhoneKey': invoice_info.get('invoicePhoneKey', ''),
'invoice': 'true' if invoice_info else 'false',
'password': global_config.get('account', 'payment_pwd'),
'codTimeType': 3,
'paymentType': 4,
'areaCode': '',
'overseas': 0,
'phone': '',
'eid': global_config.getRaw('config', 'eid'),
'fp': global_config.getRaw('config', 'fp'),
'token': token,
'pru': ''
}

return data

def submit_seckill_order(self):
"""提交抢购(秒杀)订单
:return: 抢购结果 True/False
"""
url = 'https://marathon.jd.com/seckillnew/orderService/pc/submitOrder.action'
payload = {
'skuId': self.sku_id,
}
try:
self.seckill_order_data[self.sku_id] = self._get_seckill_order_data()
except Exception as e:
logger.info('抢购失败,无法获取生成订单的基本信息,接口返回:【{}】'.format(str(e)))
return False

logger.info('提交抢购订单...')
headers = {
'User-Agent': self.user_agent,
'Host': 'marathon.jd.com',
'Referer': 'https://marathon.jd.com/seckill/seckill.action?skuId={0}&num={1}&rid={2}'.format(
self.sku_id, self.seckill_num, int(time.time())),
}
resp = self.session.post(
url=url,
params=payload,
data=self.seckill_order_data.get(
self.sku_id),
headers=headers)
resp_json = None
try:
resp_json = parse_json(resp.text)
except Exception as e:
logger.info('抢购失败,返回信息:{}'.format(resp.text[0: 128]))
return False
# 返回信息
# 抢购失败:
# {'errorMessage': '很遗憾没有抢到,再接再厉哦。', 'orderId': 0, 'resultCode': 60074, 'skuId': 0, 'success': False}
# {'errorMessage': '抱歉,您提交过快,请稍后再提交订单!', 'orderId': 0, 'resultCode': 60017, 'skuId': 0, 'success': False}
# {'errorMessage': '系统正在开小差,请重试~~', 'orderId': 0, 'resultCode': 90013, 'skuId': 0, 'success': False}
# 抢购成功:
# {"appUrl":"xxxxx","orderId":820227xxxxx,"pcUrl":"xxxxx","resultCode":0,"skuId":0,"success":true,"totalMoney":"xxxxx"}
if resp_json.get('success'):
order_id = resp_json.get('orderId')
total_money = resp_json.get('totalMoney')
pay_url = 'https:' + resp_json.get('pcUrl')
logger.info('抢购成功,订单号:{}, 总价:{}, 电脑端付款链接:{}'.format(order_id, total_money, pay_url))
if global_config.getRaw('messenger', 'enable') == 'true':
success_message = "抢购成功,订单号:{}, 总价:{}, 电脑端付款链接:{}".format(order_id, total_money, pay_url)
send_wechat(success_message)
return True
else:
logger.info('抢购失败,返回信息:{}'.format(resp_json))
if global_config.getRaw('messenger', 'enable') == 'true':
error_message = '抢购失败,返回信息:{}'.format(resp_json)
send_wechat(error_message)
return False
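
The listing calls parse_json from helper.jd_helper, which is not shown in this post. The comments above show that the JD endpoints answer in JSONP form, for example jQuery2381773({...}), so the JSON object has to be cut out of the callback wrapper before decoding. The following is a minimal sketch of how such a helper might work; parse_jsonp is a hypothetical name and is not taken from the author's helper module.

```python
import json


def parse_jsonp(text):
    """Extract and decode the JSON object inside a JSONP response.

    Hypothetical helper: keeps everything from the first '{' to the
    last '}' and hands it to json.loads.
    """
    start = text.find('{')
    end = text.rfind('}') + 1
    return json.loads(text[start:end])


sample = 'jQuery2381773({"nickName":"xxx","plusStatus":"0"})'
print(parse_jsonp(sample)["nickName"])
```

If the response contains no braces at all, json.loads raises an error, which matches how the listing wraps its parse_json calls in try/except.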





 
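The Suning script is still being tested and needs further debugging.
Original article; when reposting, please credit the source: http://30daydo.com/article/44129
You are welcome to follow the WeChat official account: 可转债量化分析
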

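A Python function call may have a space before the parentheses

李魔佛 posted an article • 0 comments • 743 views • 2020-12-13 11:13

I did not expect this to be allowed.
print ('hello')
hello
def sayhi():
...:     print('Done')
...:
sayhi () # there is a space here
Done

That said, if you routinely write code like this, your colleagues will not be pleased.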
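
One way to check that the space really makes no difference is to compare the syntax trees of both spellings. This is only a small sketch using the standard-library ast module; sayhi is just the name from the session above and does not need to be defined, because the code is parsed, not executed.

```python
import ast

# Parse both spellings; ast.dump omits line/column info by default,
# so only the structure of the call is compared.
with_space = ast.dump(ast.parse("sayhi ()"))
without_space = ast.dump(ast.parse("sayhi()"))

print(with_space == without_space)
```

Whitespace between tokens is discarded by the tokenizer, which is why both forms produce the same call node.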

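Listing Python's built-in keywords with the keyword module

李魔佛 posted an article • 0 comments • 627 views • 2020-12-13 10:57

It turns out Python even ships a standard-library module for this.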
import keyword
keyword.kwlist
Out[3]:
['False',
'None',
'True',
'and',
'as',
'assert',
'async',
'await',
'break',
'class',
'continue',
'def',
'del',
'elif',
'else',
'except',
'finally',
'for',
'from',
'global',
'if',
'import',
'in',
'is',
'lambda',
'nonlocal',
'not',
'or',
'pass',
'raise',
'return',
'try',
'while',
'with',
'yield']
len(keyword.kwlist)
Out[4]: 35
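
Beyond printing the list, the same module can be used to check whether a string is usable as a variable name. The sketch below assumes that use case; is_safe_identifier is a hypothetical helper name, while keyword.iskeyword and str.isidentifier are standard.

```python
import keyword


def is_safe_identifier(name):
    """Return True if name is a valid identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


print(is_safe_identifier("total"))  # True
print(is_safe_identifier("class"))  # False: reserved keyword
print(is_safe_identifier("2fast"))  # False: not a valid identifier
```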

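The official tutorial for WeChat Official Account backend signature verification is not compatible with Python 3

李魔佛 posted an article • 0 comments • 670 views • 2020-12-11 11:43

Whoever wrote that documentation seems to be a beginner.
First, the sample code is written in Python 2, but the document never says so.

Python 2 being old is one thing, but with so many frameworks available, they still chose the ancient web.py, which is baffling.

Signature verification under Django:
import hashlib

from django.http import HttpResponse

token = '123456789'


def Services(request):
    print(request.method)
    if request.method == 'GET':
        # default to '' so that sorting and joining never see None
        signature = request.GET.get('signature', '')
        echostr = request.GET.get('echostr', '')
        timestamp = request.GET.get('timestamp', '')
        nonce = request.GET.get('nonce', '')
        # sort token, timestamp and nonce, then concatenate them
        list_ = [token, timestamp, nonce]
        list_.sort()
        list_str = ''.join(list_)

        # Python 3: hashlib needs bytes, so encode the string first
        sha1 = hashlib.sha1(list_str.encode('utf8'))
        hashcode = sha1.hexdigest()
        if hashcode == signature:
            return HttpResponse(echostr)
        else:
            return HttpResponse('')
    # a Django view must also return a response for other methods
    return HttpResponse('')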
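
The check itself does not depend on Django, so it can be pulled out and tested on its own. The following is a minimal sketch under that assumption; check_signature is a hypothetical helper name, and hmac.compare_digest is swapped in for == as a constant-time comparison, which the original view does not use.

```python
import hashlib
import hmac


def check_signature(token, timestamp, nonce, signature):
    """Return True if signature matches SHA-1 of the sorted, joined parts."""
    joined = ''.join(sorted([token, timestamp, nonce]))
    # In Python 3 the string must be encoded to bytes before hashing.
    digest = hashlib.sha1(joined.encode('utf-8')).hexdigest()
    return hmac.compare_digest(digest, signature)


# Build a valid signature locally to exercise the function.
expected = hashlib.sha1(
    ''.join(sorted(['123456789', '1607658000', 'abc'])).encode('utf-8')
).hexdigest()
print(check_signature('123456789', '1607658000', 'abc', expected))
```

The timestamp and nonce values here are made-up sample inputs, not values from a real WeChat request.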
Original article; when reposting, please credit the source: http://30daydo.com/article/44121

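PyCharm conditional breakpoints

李魔佛 posted an article • 0 comments • 672 views • 2020-12-01 13:42

Inside a loop, or a recursion that runs many times, you can use a conditional breakpoint.

First set a breakpoint on the line where you want to stop, then right-click the breakpoint; a breakpoint window pops up, and you enter a condition in it.

For example:
def main():
    for i in range(100):
        print(i*i)


if __name__ == '__main__':
    main()
Set a breakpoint on the print(i*i) line and enter i == 50 as the condition (a comparison, not the assignment i=50). In debug mode, execution then stops only when i equals 50.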
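
The condition field takes an ordinary Python expression. Outside the IDE, a similar effect can be had in code with the built-in breakpoint() guarded by the same expression. This is a different technique from PyCharm's dialog, shown only as a sketch; the on_hit parameter is a hypothetical addition so the hook can be replaced when no debugger is wanted.

```python
def main(on_hit=breakpoint):
    for i in range(100):
        if i == 50:   # same expression you would type as the condition
            on_hit()  # drops into the debugger only when i is 50
        print(i * i)
```

Calling main() pauses once, at i equal to 50; calling main(on_hit=lambda: None) runs straight through.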

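Running JS statements from Python when there is no function return value

李魔佛 posted an article • 0 comments • 602 views • 2020-11-30 16:11

Sometimes you extract a few statements from a piece of JS code, but they are not a function.
Take the following JS as an example:
var radra27radra27 = "D";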
var ra72419ra91ra72419ra91 = "7.241.9" + ".";
var raurst500ra63raurst500ra63 = "urst=500";
var ravalidtora49ravalidtora49 = "validto=";
var raevphncdra57raevphncdra57 = "ev.ph" + "ncd";
var ra16067161ra17ra16067161ra17 = "16067161";
var radeos202ra16radeos202ra16 = "deos/20" + "2";
var ra1080p4ra73ra1080p4ra73 = "/1080P_4";
var ra209hashra72ra209hashra72 = "209&hash";
var ra2bmkdz7nra36ra2bmkdz7nra36 = "2BMKd" + "z7N";
var ra6708909ra29ra6708909ra29 = "6708909" + "&";
var ra00kip4ra41ra00kip4ra41 = "00k&i" + "p=4";
var ra006163ra73ra006163ra73 = "006/16/" + "3";
var raro7upu3ra66raro7upu3ra66 = "Ro7UPU%3";
var raroiu6qra26raroiu6qra26 = "=rOiU6q%";
var ra075351mra26ra075351mra26 = "075351." + "m";
var ramgdmctbvra11ramgdmctbvra11 = "MgdmCtbV";
Sometimes you pull a few statements out of a page's JS code, but they are not wrapped in a function.
Take the following piece of JS:
var radra27radra27 = "D";
var ra72419ra91ra72419ra91 = "7.241.9" + ".";
var raurst500ra63raurst500ra63 = "urst=500";
var ravalidtora49ravalidtora49 = "validto=";
var raevphncdra57raevphncdra57 = "ev.ph" + "ncd";
var ra16067161ra17ra16067161ra17 = "16067161";
var radeos202ra16radeos202ra16 = "deos/20" + "2";
var ra1080p4ra73ra1080p4ra73 = "/1080P_4";
var ra209hashra72ra209hashra72 = "209&hash";
var ra2bmkdz7nra36ra2bmkdz7nra36 = "2BMKd" + "z7N";
var ra6708909ra29ra6708909ra29 = "6708909" + "&";
var ra00kip4ra41ra00kip4ra41 = "00k&i" + "p=4";
var ra006163ra73ra006163ra73 = "006/16/" + "3";
var raro7upu3ra66raro7upu3ra66 = "Ro7UPU%3";
var raroiu6qra26raroiu6qra26 = "=rOiU6q%";
var ra075351mra26ra075351mra26 = "075351." + "m";
var ramgdmctbvra11ramgdmctbvra11 = "MgdmCtbV";
var rap4validra25rap4validra25 = "p4?valid";
var ra09ratera79ra09ratera79 = "09&rate=";
var rancomvira35rancomvira35 = "n.com/vi";
var ra24075351ra94ra24075351ra94 = "24075" + "351";
var ra000k324ra70ra000k324ra70 = "000K_324";
var ra50000kbra49ra50000kbra49 = "50000k&" + "b";
var rahttpsra83rahttpsra83 = "https://";
var rafrom160ra56rafrom160ra56 = "from=16" + "0";
var quality_1080p =/* + radra27radra27 + */rahttpsra83rahttpsra83 + /* + rancomvira35rancomvira35 + */raevphncdra57raevphncdra57 + /* + radra27radra27 + */rancomvira35rancomvira35 + /* + ra006163ra73ra006163ra73 + */radeos202ra16radeos202ra16 + /* + ra09ratera79ra09ratera79 + */ra006163ra73ra006163ra73 + /* + ra1080p4ra73ra1080p4ra73 + */ra24075351ra94ra24075351ra94 + /* + raroiu6qra26raroiu6qra26 + */ra1080p4ra73ra1080p4ra73 + /* + ra000k324ra70ra000k324ra70 + */ra000k324ra70ra000k324ra70 + /* + rancomvira35rancomvira35 + */ra075351mra26ra075351mra26 + /* + ravalidtora49ravalidtora49 + */rap4validra25rap4validra25 + /* + ra209hashra72ra209hashra72 + */rafrom160ra56rafrom160ra56 + /* + ra1080p4ra73ra1080p4ra73 + */ra6708909ra29ra6708909ra29 + /* + ra209hashra72ra209hashra72 + */ravalidtora49ravalidtora49 + /* + ramgdmctbvra11ramgdmctbvra11 + */ra16067161ra17ra16067161ra17 + /* + ra24075351ra94ra24075351ra94 + */ra09ratera79ra09ratera79 + /* + ra50000kbra49ra50000kbra49 + */ra50000kbra49ra50000kbra49 + /* + ramgdmctbvra11ramgdmctbvra11 + */raurst500ra63raurst500ra63 + /* + ra209hashra72ra209hashra72 + */ra00kip4ra41ra00kip4ra41 + /* + raroiu6qra26raroiu6qra26 + */ra72419ra91ra72419ra91 + /* + ra09ratera79ra09ratera79 + */ra209hashra72ra209hashra72 + /* + raro7upu3ra66raro7upu3ra66 + */raroiu6qra26raroiu6qra26 + /* + ra075351mra26ra075351mra26 + */ra2bmkdz7nra36ra2bmkdz7nra36 + /* + ra50000kbra49ra50000kbra49 + */ramgdmctbvra11ramgdmctbvra11 + /* + radeos202ra16radeos202ra16 + */raro7upu3ra66raro7upu3ra66 + /* + ra075351mra26ra075351mra26 + */radra27radra27;
flashvars_324075351["quality_1080p"] = quality_1080p;

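It does a lot of string concatenation to hide what it is building. After a closer look, though, it would be simple enough to rewrite in Python, or you could put it inside a function.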
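As a rough illustration of the "rewrite it in Python" option, here is a minimal sketch that rebuilds the concatenated string without running any JS. The names `resolve_concat`, `VAR_DECL` and `BLOCK_COMMENT` are made up for this example. It assumes the fragment looks like the one above: every piece is a double-quoted literal with no escaped quotes inside, and the target is a plain `+` chain of variable names with decoy `/* ... */` comments in between.

```python
import re

# Matches declarations such as: var name = "abc" + "def";
VAR_DECL = re.compile(r'var\s+(\w+)\s*=\s*((?:"[^"]*"\s*\+?\s*)+);')
# Matches the decoy comments placed between the real operands.
BLOCK_COMMENT = re.compile(r'/\*.*?\*/', re.S)


def resolve_concat(js_source, target):
    """Rebuild the string assigned to `target` without executing JS."""
    values = {}
    for name, rhs in VAR_DECL.findall(js_source):
        # Join the quoted literals on the right-hand side.
        values[name] = "".join(re.findall(r'"([^"]*)"', rhs))

    # Locate `var <target> = ...;` and keep only the expression part.
    pattern = r'var\s+%s\s*=(.*?);' % re.escape(target)
    match = re.search(pattern, js_source, re.S)
    if match is None:
        raise ValueError("no assignment found for %r" % target)

    # Drop the decoy comments, then substitute each variable name.
    expr = BLOCK_COMMENT.sub("", match.group(1))
    parts = [p.strip() for p in expr.split("+") if p.strip()]
    return "".join(values[p] for p in parts)
```

Calling `resolve_concat(js, "quality_1080p")` on a fragment of this shape should give back the assembled URL. Anything beyond this simple pattern (escaped quotes, function calls, arithmetic) is outside what this sketch handles, which is where a real JS engine becomes the better choice.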
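To wrap it in a function, for example, download the JS from the network at run time, then add a header and a footer so that it becomes a function: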
function getValue(){
xxxxxx
xxxxxx
return flashvars_324075351
}
 
Then execute this JS and call getValue() to get the return value.
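The wrapping step could be sketched in Python roughly as follows. The helper names `wrap_in_function` and `call_wrapped` are hypothetical. The sketch relies on `js2py.eval_js` handing back a callable when the source is a function declaration. It also assumes the body already declares the object being returned: the fragment shown above never declares `flashvars_324075351`, so you may need to prepend something like `var flashvars_324075351 = {};`.

```python
def wrap_in_function(js_body, result_name, func_name="getValue"):
    """Wrap loose JS statements in a function that returns `result_name`."""
    return "function %s(){\n%s\nreturn %s;\n}" % (func_name, js_body, result_name)


def call_wrapped(js_body, result_name):
    """Evaluate the wrapped function with js2py and call it from Python."""
    # Imported lazily so the string helper above works without js2py installed.
    import js2py

    get_value = js2py.eval_js(wrap_in_function(js_body, result_name))
    return get_value()
```

Here `js_body` would be the downloaded JS text and `result_name` the variable to hand back, for instance `"flashvars_324075351"`.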
 
Today, however, we will use a different method to get flashvars_324075351 directly.
 
The js2py library is all we need.
 
Take the JS statements above:
var radra27radra27 = "D";
var ra72419ra91ra72419ra91 = "7.241.9" + ".";
var raurst500ra63raurst500ra63 = "urst=500";
var ravalidtora49ravalidtora49 = "validto=";
var raevphncdra57raevphncdra57 = "ev.ph" + "ncd";
var ra16067161ra17ra16067161ra17 = "16067161";
var radeos202ra16radeos202ra16 = "deos/20" + "2";
var ra1080p4ra73ra1080p4ra73 = "/1080P_4";
var ra209hashra72ra209hashra72 = "209&hash";
var ra2bmkdz7nra36ra2bmkdz7nra36 = "2BMKd" + "z7N";
var ra6708909ra29ra6708909ra29 = "6708909" + "&";
var ra00kip4ra41ra00kip4ra41 = "00k&i" + "p=4";
var ra006163ra73ra006163ra73 = "006/16/" + "3";
var raro7upu3ra66raro7upu3ra66 = "Ro7UPU%3";
var raroiu6qra26raroiu6qra26 = "=rOiU6q%";
var ra075351mra26ra075351mra26 = "075351." + "m";
var ramgdmctbvra11ramgdmctbvra11 = "MgdmCtbV";
var rap4validra25rap4validra25 = "p4?valid";
var ra09ratera79ra09ratera79 = "09&rate=";
var rancomvira35rancomvira35 = "n.com/vi";
var ra24075351ra94ra24075351ra94 = "24075" + "351";
var ra000k324ra70ra000k324ra70 = "000K_324";
var ra50000kbra49ra50000kbra49 = "50000k&" + "b";
var rahttpsra83rahttpsra83 = "https://";
var rafrom160ra56rafrom160ra56 = "from=16" + "0";
var quality_1080p =/* + radra27radra27 + */rahttpsra83rahttpsra83 + /* + rancomvira35rancomvira35 + */raevphncdra57raevphncdra57 + /* + radra27radra27 + */rancomvira35rancomvira35 + /* + ra006163ra73ra006163ra73 + */radeos202ra16radeos202ra16 + /* + ra09ratera79ra09ratera79 + */ra006163ra73ra006163ra73 + /* + ra1080p4ra73ra1080p4ra73 + */ra24075351ra94ra24075351ra94 + /* + raroiu6qra26raroiu6qra26 + */ra1080p4ra73ra1080p4ra73 + /* + ra000k324ra70ra000k324ra70 + */ra000k324ra70ra000k324ra70 + /* + rancomvira35rancomvira35 + */ra075351mra26ra075351mra26 + /* + ravalidtora49ravalidtora49 + */rap4validra25rap4validra25 + /* + ra209hashra72ra209hashra72 + */rafrom160ra56rafrom160ra56 + /* + ra1080p4ra73ra1080p4ra73 + */ra6708909ra29ra6708909ra29 + /* + ra209hashra72ra209hashra72 + */ravalidtora49ravalidtora49 + /* + ramgdmctbvra11ramgdmctbvra11 + */ra16067161ra17ra16067161ra17 + /* + ra24075351ra94ra24075351ra94 + */ra09ratera79ra09ratera79 + /* + ra50000kbra49ra50000kbra49 + */ra50000kbra49ra50000kbra49 + /* + ramgdmctbvra11ramgdmctbvra11 + */raurst500ra63raurst500ra63 + /* + ra209hashra72ra209hashra72 + */ra00kip4ra41ra00kip4ra41 + /* + raroiu6qra26raroiu6qra26 + */ra72419ra91ra72419ra91 + /* + ra09ratera79ra09ratera79 + */ra209hashra72ra209hashra72 + /* + raro7upu3ra66raro7upu3ra66 + */raroiu6qra26raroiu6qra26 + /* + ra075351mra26ra075351mra26 + */ra2bmkdz7nra36ra2bmkdz7nra36 + /* + ra50000kbra49ra50000kbra49 + */ramgdmctbvra11ramgdmctbvra11 + /* + radeos202ra16radeos202ra16 + */raro7upu3ra66raro7upu3ra66 + /* + ra075351mra26ra075351mra26 + */radra27radra27;
flashvars_324075351["quality_1080p"] = quality_1080p;

Append the value you want back as the final statement, without a return keyword.
For example:
....
ra50000kbra49ra50000kbra49 + */ramgdmctbvra11ramgdmctbvra11 + /* + radeos202ra16radeos202ra16 + */raro7upu3ra66raro7upu3ra66 + /* + ra075351mra26ra075351mra26 + */radra27radra27;
flashvars_324075351["quality_1080p"] = quality_1080p;
flashvars_324075351;

Then call js2py directly:
res = js2py.eval_js(js)
 
After running it, print(res) shows the value of flashvars_324075351.
Original article; please credit the source when reposting.
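Put together, the trailing-expression approach might look like the sketch below. `build_script` and `eval_fragment` are hypothetical helper names. The `declare` flag is an assumption on our part: it prepends an empty object declaration because the extracted fragment assigns to `flashvars_324075351` without declaring it, which would otherwise fail when the fragment runs on its own. `to_dict()` is the js2py wrapper method for turning a returned JS object into a Python dict.

```python
def build_script(js_body, result_name, declare=True):
    """Return JS source whose final expression statement is `result_name`."""
    # Optionally declare the result object first (assumed to be a plain object).
    prefix = "var %s = {};\n" % result_name if declare else ""
    return prefix + js_body.rstrip() + "\n" + result_name + ";"


def eval_fragment(js_body, result_name):
    """Evaluate the fragment with js2py and return the result as a dict."""
    # Imported lazily so build_script can be used without js2py installed.
    import js2py

    res = js2py.eval_js(build_script(js_body, result_name))
    return res.to_dict()
```

With the fragment above as `js_body`, `eval_fragment(js_body, "flashvars_324075351")["quality_1080p"]` would be the assembled URL string.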
http://30daydo.com/article/44112