Python: Script a Google Autosuggest Extract of Trends for Your Niche Search Keywords

Python Script To Capture Autosuggest Trends

Everyone likes Google Trends, but it gets a bit tricky when it comes to long-tail keywords. We all like the official Google Trends service for getting insights into search behavior. However, two things keep many people from using it for serious work:

  1. When you need to find new niche keywords, there is not enough data on Google Trends.
  2. There is no official API for making requests to Google Trends. When we use modules like pytrends, we have to use proxy servers or we get blocked.

In this article, I'll share a Python script we wrote to export trending keywords via Google Autosuggest.

Fetch and Store Autosuggest Results Over Time

Suppose we have 1,000 seed keywords to send to Google Autosuggest. In return, we will probably get around 200,000 long-tail keywords. We then need to do the same thing one week later and compare the two datasets to answer two questions:

  • Which queries are new keywords compared to the last run? This is probably the case we care about most. Google thinks those queries are becoming more significant, so by tracking them we can build our own Google Autosuggest trend solution!
  • Which queries are keywords that are no longer trending?
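
Both questions come down to set differences between two snapshots. Here is a minimal sketch, assuming each weekly run is held as a plain Python set of suggestion strings. The function name `compare_snapshots` and the sample data are made up for illustration and are not part of the script further down.

```python
def compare_snapshots(previous, current):
    """Split suggestions into new, dropped, and still-present groups."""
    previous, current = set(previous), set(current)
    return {
        "new": current - previous,      # appeared since the last run
        "dropped": previous - current,  # no longer suggested
        "common": current & previous,   # seen in both runs
    }


# Hypothetical sample data, for illustration only
last_week = {"python tutorial", "python download", "python list"}
this_week = {"python tutorial", "python list", "python match statement"}

result = compare_snapshots(last_week, this_week)
print("new:", sorted(result["new"]))
print("dropped:", sorted(result["dropped"]))
```

The full script does the same comparison per seed keyword and additionally records the dates on which each suggestion was first and last seen.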

The script is quite simple, and I shared most of the code here. The updated code saves the data from past runs and compares the suggestions over time. We avoided file-based databases like SQLite to keep things simple, so all data storage uses the CSV files described below. This lets you import the file into Excel and explore niche keyword trends for your business.

How To Use This Python Script

  1. Enter the set of seed keywords that should be sent to autocomplete: keywords.csv
  2. Adjust the script settings to your needs:
    • LANGUAGE: default "en"
    • COUNTRY: default "US"
  3. Schedule the script to run once a week. You can also run it manually whenever you like.
  4. Use keyword_suggestions.csv for further analysis:
    • first_seen: the date on which the query first appeared in autosuggest
    • last_seen: the date on which the query was last seen
    • is_new: set to True if first_seen == last_seen. Filter on this value to get the new trending searches in Google autosuggest.
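
As a rough sketch of that analysis step, the snippet below reads data in the layout of keyword_suggestions.csv and separates suggestions that are new in the latest run from those that were not returned in it. It uses only the standard library `csv` module so that it runs without pandas. The sample rows and the function name `split_by_latest_run` are hypothetical, and the sketch assumes dates are stored as YYYY-MM-DD. Because is_new is True whenever first_seen equals last_seen, a suggestion that showed up in only one earlier run also keeps is_new == True, so the sketch additionally checks last_seen against the latest run date.

```python
import csv
import io

# Hypothetical sample in the column layout of keyword_suggestions.csv
SAMPLE = """first_seen,last_seen,Keyword,Suggestion,is_new
2021-03-08,2021-03-08,python,python match statement,True
2021-03-01,2021-03-08,python,python tutorial,False
2021-03-01,2021-03-01,python,python 2 download,True
"""


def split_by_latest_run(csv_file):
    """Return (new suggestions, dropped suggestions) relative to the latest run."""
    rows = list(csv.DictReader(csv_file))
    # ISO dates (YYYY-MM-DD) compare correctly as plain strings
    latest_run = max(row["last_seen"] for row in rows)
    new = [r["Suggestion"] for r in rows
           if r["is_new"] == "True" and r["last_seen"] == latest_run]
    dropped = [r["Suggestion"] for r in rows if r["last_seen"] < latest_run]
    return new, dropped


new, dropped = split_by_latest_run(io.StringIO(SAMPLE))
print("new:", new)
print("dropped:", dropped)
```

To run it against real output, pass an open file handle for keyword_suggestions.csv instead of the in-memory sample.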

Here's the Python Code

# Pemavor.com Autocomplete Trends
# Author: Stefan Neefischer (stefan.neefischer@gmail.com)
import concurrent.futures
import itertools
import json
import string
import time
from datetime import datetime

import pandas as pd
import requests

charList = " " + string.ascii_lowercase + string.digits

def makeGoogleRequest(query):
    # If you make requests too quickly, you may be blocked by Google
    time.sleep(WAIT_TIME)
    URL = "http://suggestqueries.google.com/complete/search"
    PARAMS = {"client": "opera",
              "hl": LANGUAGE,
              "q": query,
              "gl": COUNTRY}
    try:
        # A timeout keeps one hanging request from stalling a worker thread
        response = requests.get(URL, params=PARAMS, timeout=10)
    except requests.RequestException:
        return "ERR"
    if response.status_code != 200:
        return "ERR"
    try:
        # The response is a JSON array; element 1 holds the list of suggestions
        return json.loads(response.content.decode('utf-8'))[1]
    except UnicodeDecodeError:
        # Some responses are not valid UTF-8, so fall back to latin-1
        return json.loads(response.content.decode('latin-1'))[1]

def getGoogleSuggests(keyword):
    # Query the keyword followed by a space and each character in charList
    queryList = [keyword + " " + char for char in charList]
    suggestions = []
    for query in queryList:
        suggestion = makeGoogleRequest(query)
        if suggestion != 'ERR':
            suggestions.append(suggestion)

    # Flatten the per-query lists into one set, then drop empty suggestions
    suggestions = set(itertools.chain(*suggestions))
    suggestions.discard("")
    return suggestions

def autocomplete(csv_fileName):
    dateTimeObj = datetime.now().date()
    # Read the CSV file that contains the keywords to send to Google autocomplete
    df = pd.read_csv(csv_fileName)
    keywords = df.iloc[:,0].tolist()
    resultList = []

    with concurrent.futures.ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
        futuresGoogle = {executor.submit(getGoogleSuggests, keyword): keyword for keyword in keywords}

        for future in concurrent.futures.as_completed(futuresGoogle):
            key = futuresGoogle[future]
            for suggestion in future.result():
                resultList.append([key, suggestion])

    # Convert the results to a dataframe
    suggestion_new = pd.DataFrame(resultList, columns=['Keyword','Suggestion'])
    del resultList

    # If we have old results, read them
    try:
        suggestion_df = pd.read_csv("keyword_suggestions.csv")
    except FileNotFoundError:
        # First run: start with an empty table
        suggestion_df = pd.DataFrame(columns=['first_seen','last_seen','Keyword','Suggestion'])
    
    suggestionCommon_list=[]
    suggestionNew_list=[]
    for keyword in suggestion_new["Keyword"].unique():
        new_df=suggestion_new[suggestion_new["Keyword"]==keyword]
        old_df=suggestion_df[suggestion_df["Keyword"]==keyword]
        newSuggestion=set(new_df["Suggestion"].to_list())
        oldSuggestion=set(old_df["Suggestion"].to_list())
        commonSuggestion=list(newSuggestion & oldSuggestion)
        new_Suggestion=list(newSuggestion - oldSuggestion)
         
        for suggest in commonSuggestion:
            suggestionCommon_list.append([dateTimeObj,keyword,suggest])
        for suggest in new_Suggestion:
            suggestionNew_list.append([dateTimeObj,dateTimeObj,keyword,suggest])
    
    # New suggestions
    newSuggestion_df = pd.DataFrame(suggestionNew_list, columns=['first_seen','last_seen','Keyword','Suggestion'])
    # Suggestions seen before, with an updated last_seen date
    commonSuggestion_df = pd.DataFrame(suggestionCommon_list, columns=['last_seen','Keyword','Suggestion'])
    # Match on both Keyword and Suggestion so that a suggestion shared by
    # several seed keywords does not produce duplicate rows
    merge = pd.merge(suggestion_df, commonSuggestion_df, on=["Keyword", "Suggestion"], how='left')
    # Use the new last_seen where available, otherwise keep the old one
    merge["last_seen"] = merge["last_seen_y"].fillna(merge["last_seen_x"])
    merge = merge.drop(columns=["last_seen_x", "last_seen_y"])
    
    #merge old results with new results
    frames = [merge, newSuggestion_df]
    keywords_df =  pd.concat(frames, ignore_index=True, sort=False)
    # Normalize the date columns and flag suggestions seen on a single date only
    keywords_df['first_seen'] = pd.to_datetime(keywords_df['first_seen'])
    keywords_df['last_seen'] = pd.to_datetime(keywords_df['last_seen'])
    keywords_df = keywords_df.sort_values(by=['first_seen','Keyword'], ascending=[False,False])
    keywords_df['is_new'] = (keywords_df['first_seen'] == keywords_df['last_seen'])
    keywords_df = keywords_df[['first_seen','last_seen','Keyword','Suggestion','is_new']]
    # Save the dataframe as a CSV file
    keywords_df.to_csv('keyword_suggestions.csv', index=False)

# If you use more than 50 seed keywords, slow down your requests; otherwise Google may block the script
# If you have thousands of seed keywords, use e.g. WAIT_TIME = 1 and MAX_WORKERS = 5
WAIT_TIME = 0.2
MAX_WORKERS = 20
# set the autocomplete language
LANGUAGE = "en"
# set the autocomplete country code - DE, US, TR, GR, etc..
COUNTRY="US"
# Seed keyword CSV file name. The file must have a single column of keywords.
# Note: pandas reads the first row of the file as the column header.
CSV_FILE_NAME="keywords.csv"
autocomplete(CSV_FILE_NAME)
# The results are saved to keyword_suggestions.csv
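
To get a feel for the WAIT_TIME and MAX_WORKERS settings, here is a back-of-the-envelope sketch. It assumes the only delay is the sleep before each request and ignores network latency, so the result is a rough lower bound rather than a measurement. The helper name `estimate_run` is hypothetical and not part of the script.

```python
import string

# Same character set the script appends to each seed keyword
CHAR_LIST = " " + string.ascii_lowercase + string.digits  # 37 characters


def estimate_run(seed_count, wait_time, max_workers):
    """Return (total requests, rough lower bound on runtime in seconds)."""
    total_requests = seed_count * len(CHAR_LIST)
    # Each request sleeps wait_time; up to max_workers keywords run in parallel
    seconds = total_requests * wait_time / max_workers
    return total_requests, seconds


requests_needed, seconds = estimate_run(seed_count=1000, wait_time=1, max_workers=5)
print(requests_needed, "requests, at least", round(seconds / 3600, 1), "hours")
```

With 1,000 seed keywords and the slower settings suggested in the comments, that is 37,000 requests and a run of at least two hours, which is worth knowing before you schedule the script.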

