Python, GTK and Konachan.com [Python]

Published by Renan (last updated on 05/04/2010)

Download konachan-update.py

The anime wallpaper site Konachan ( http://konachan.com/ ) names its files as follows:

Konachan.com - {id} {tags}.{ext}

One problem with this scheme is that any user can change the tags (subject to moderation) and, as a consequence, change the file name.

So if I download a file and someone later changes one of its tags, the file on the server will have a different name (one that describes the wallpaper's content more accurately).
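
To make the naming scheme concrete, here is a minimal sketch that splits a name of this form into its parts. The function name `parse_konachan_name`, the regular expression and the sample file names are illustrative assumptions, not something taken from the site or from the script.

```python
import re

# Assumed pattern for "Konachan.com - {id} {tags}.{ext}"; tags are space separated.
NAME_RE = re.compile(r"^Konachan\.com - (?P<id>[0-9]+) (?P<tags>.+)\.(?P<ext>[^.]+)$")


def parse_konachan_name(filename):
    """Return (id, tags, ext) for a matching file name, or None otherwise."""
    match = NAME_RE.match(filename)
    if match is None:
        return None
    return int(match.group("id")), match.group("tags").split(), match.group("ext")


# Hypothetical names: the same post before and after a tag edit.
old = parse_konachan_name("Konachan.com - 12345 landscape sky.jpg")
new = parse_konachan_name("Konachan.com - 12345 clouds landscape sky.jpg")
same_post = old[0] == new[0]      # the id survives the edit
renamed = old[1] != new[1]        # the tag list, and so the file name, does not
```

In this example the id stays the same while the tag list differs, which is why the id is the only reliable key for matching a local file against the server.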

To remedy this, I wrote a Python script that connects to the Danbooru API used by Konachan ( http://danbooru.donmai.us/help/api ) and runs a series of checks so I can update my files and find out whether any of them has been removed.
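
The check for a single post can be sketched roughly as follows. This is an illustration under assumptions, not the script itself: it uses Python 3 module names (`urllib.parse`) while the script targets Python 2, the names `build_url` and `classify_post` are hypothetical, and the sample response is made up (including its `image/abc` path), containing only the `file_url` field that the script reads. No network access is involved.

```python
import json
import posixpath
from urllib.parse import unquote  # Python 3 location of unquote

BASE_URL = "http://konachan.com/post/index.json?tags=id:"


def build_url(post_id):
    """API URL that lists the post with the given id."""
    return BASE_URL + str(post_id)


def classify_post(response_text, local_files):
    """Return (status, server_name); status is removed, up_to_date or outdated."""
    try:
        file_url = json.loads(response_text)[0]["file_url"]
    except (ValueError, IndexError, KeyError, TypeError):
        # Empty list or unexpected payload: treat the post as gone.
        return "removed", None
    server_name = posixpath.basename(unquote(file_url))
    if server_name in local_files:
        return "up_to_date", server_name
    return "outdated", server_name


# Made-up response for a post whose tags changed after the local download.
local = {"Konachan.com - 12345 landscape sky.jpg"}
sample = ('[{"file_url": "http://konachan.com/image/abc/'
          'Konachan.com%20-%2012345%20clouds%20landscape%20sky.jpg"}]')
status, server_name = classify_post(sample, local)
```

Here `status` comes out as `outdated`, because the decoded server name is not among the local files; an empty JSON list would be reported as `removed`.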

Later I added support for http://moe.imouto.org/ (2 lines of code and a few adaptations). I also added a progress bar, using my fix of the GTK main loop for long-running tasks.
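
The idea behind the progress bar can be shown without any GUI toolkit: after each item, the long loop reports a fraction and then gives the toolkit a chance to handle pending events. The sketch below is an assumption-laden illustration; `run_with_progress`, `report` and `pump_events` are hypothetical names. With PyGTK, `pump_events` would be a function that calls `gtk.main_iteration(False)` while `gtk.events_pending()` is true, and `report` would update the `gtk.ProgressBar`.

```python
def run_with_progress(items, work, report, pump_events=lambda: None):
    """Run work(item) for each item, reporting progress and pumping UI events."""
    total = len(items)
    results = []
    for index, item in enumerate(items, start=1):
        results.append(work(item))
        fraction = float(index) / total
        report(fraction, "%d / %d - %d%%" % (index, total, int(100 * fraction)))
        pump_events()  # let the GUI redraw between items
    return results


# Console stand-in for the progress bar: collect the messages instead of drawing them.
messages = []
squares = run_with_progress([1, 2, 3, 4], lambda n: n * n,
                            lambda fraction, text: messages.append(text))
```

Without the `pump_events` call the window would only repaint once the whole loop has finished, which is the problem the main loop fix addresses.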

I commented the file in English.

#!/usr/bin/env python
# Copyright (C) 2010 <Renan Vedovato Traba> <hellupline@gmail.com>
#This program is free software: you can redistribute it and/or modify it 
#under the terms of the GNU General Public License version 3, as published 
#by the Free Software Foundation.
#
#This program is distributed in the hope that it will be useful, but 
#WITHOUT ANY WARRANTY; without even the implied warranties of 
#MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR 
#PURPOSE.  See the GNU General Public License for more details.
#
#You should have received a copy of the GNU General Public License along 
#with this program.  If not, see <http://www.gnu.org/licenses/>.

import os, re, json, urllib, urllib2, gtk

def process_loading():
    """Emulate a gtk main loop: process pending events so the window stays responsive"""
    while gtk.events_pending():
        gtk.main_iteration(False)

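def load_web_file(url):
    """Download a file, retrying until the request succeeds"""
    while True:
        try:
            return urllib2.urlopen(url).read()
        except Exception:
            # Retry on any error; unlike a bare except, this still lets Ctrl+C through
            pass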

"""Directory to search"""
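# Only the last assignment takes effect, so the unused alternatives are commented out
#directory = "/home/hellupline/Wallpapers/Exclusive"
directory = "/home/hellupline/Wallpapers"

# File name prefix and API url of the board to check (enable exactly one pair)
#search = "moe "
#base_url = "http://moe.imouto.org/post/index.json?tags=id:"

search = "Konachan.com - "
base_url = "http://konachan.com/post/index.json?tags=id:"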

"""Logs lists"""
file_list = [f for f in os.listdir(directory) if f.startswith(search)]
id_list = []
valid_id_list = []
not_valid_id_list = []
url_list = []
up_to_date_file_list = []
up_to_date_id_list = []
up_to_date_url_list = []
not_up_to_date_file_list = []
not_up_to_date_id_list = []
not_up_to_date_url_list = []
to_remove_list = []
repeted_id_list = []
repeted_file_list = []

# This list contains all ids (some of them can be repeated)
id_list = [re.match(re.escape(search) + '(?P<id>[0-9]+)', f).group("id") for f in file_list]

total = len(set(id_list))

# This will create a gtk window with a progress bar
pbar = gtk.ProgressBar()
window = gtk.Window(gtk.WINDOW_TOPLEVEL)
window.add(pbar)
window.resize(260, 60)
window.set_position(gtk.WIN_POS_CENTER_ALWAYS)
window.set_decorated(False)
window.show_all()
process_loading()

for i, id in enumerate(set(id_list)):
    url = base_url + str(id) #API url
    web_file = load_web_file(url) #the json from the API

    # Calculate percentage
    fraction = float(i+1)/total
    pbar.set_fraction(fraction)
    pbar.set_text(str(i+1) + " - " + str(total) + " - " + str(int(100 * fraction)) + "%")
    process_loading()

    # This will try to get a link (the file url), if any
    try:
        link = json.loads(web_file)[0]["file_url"]
    except (ValueError, IndexError, KeyError, TypeError):
        link = ""
    if link:
        link = urllib.unquote(link)
        valid_id_list.append(id)
        url_list.append(link)
        if os.path.basename(link) in file_list:
            up_to_date_file_list.append(os.path.basename(link)) #filenames with the names up-to-date with the server
            up_to_date_id_list.append(id) #ids of files with the names up-to-date with the server
            up_to_date_url_list.append(link) #urls of files with the names up-to-date with the server
        else:
            not_up_to_date_file_list.append(os.path.basename(link)) #filenames with the names not-up-to-date with the server
            not_up_to_date_id_list.append(id) #ids of files with the names not-up-to-date with the server
            not_up_to_date_url_list.append(link) #urls of files with the names not-up-to-date with the server
    else:
        not_valid_id_list.append(id) #ids from files that are deleted from server

window.destroy()

# List of the files that will be removed to be replaced by the new ones
# (the trailing space keeps id 12 from also matching the files of id 123)
to_remove_list = [filename for filename in file_list for item in not_up_to_date_id_list if filename.startswith(search + str(item) + " ")]

# List of repeated ids and files (2 or more files with the same id; on danbooru based boards this means the same file)
repeted_id_list = set([ item for item in id_list if id_list.count(item) > 1 ])
repeted_file_list = [filename for filename in file_list for item in repeted_id_list if filename.startswith(search + str(item) + " ")]

write_list = [
    #["file_list", file_list],
    #["id_list", set(id_list)],
    #["valid_id_list", valid_id_list],
    #["not_valid_id_list", not_valid_id_list],
    #["url_list", url_list],
    #["up_to_date_file_list", up_to_date_file_list],
    #["up_to_date_id_list", up_to_date_id_list],
    #["up_to_date_url_list", up_to_date_url_list],
    #["not_up_to_date_file_list", not_up_to_date_file_list],
    #["not_up_to_date_id_list", not_up_to_date_id_list],
    ["not_up_to_date_url_list", not_up_to_date_url_list],
    ["to_remove_list", to_remove_list],
    #["repeted_id_list", repeted_id_list],
    ["repeted_file_list", repeted_file_list],
]

# Record the logs (the path is relative to the current working directory)
for title, item in write_list:
    with open("Desktop/"+title, "w") as log_file:
        log_file.write("\n".join(item))
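
The duplicate check in the script calls `id_list.count` once per element and matches files by name prefix. A possible alternative, sketched here under the assumption that file names follow the `{prefix}{id} {tags}.{ext}` shape, groups the files by their exact id in a single pass. `group_by_id` and the sample names are hypothetical.

```python
import re
from collections import defaultdict


def group_by_id(file_list, prefix):
    """Map each numeric id (as a string) to the file names that carry it."""
    pattern = re.compile(re.escape(prefix) + r"(?P<id>[0-9]+)")
    groups = defaultdict(list)
    for name in file_list:
        match = pattern.match(name)
        if match:  # skip names that do not follow the expected shape
            groups[match.group("id")].append(name)
    return groups


files = [
    "Konachan.com - 12 sky.jpg",
    "Konachan.com - 12 sky clouds.jpg",
    "Konachan.com - 123 sea.png",
]
groups = group_by_id(files, "Konachan.com - ")
# Keep only the ids that appear on more than one file.
repeated = dict((key, names) for key, names in groups.items() if len(names) > 1)
```

Because the regular expression captures the whole run of digits, id 12 and id 123 land in separate groups, so only the two files of id 12 are reported as repeated.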
