An English keyword extraction model

Published on Aug. 22, 2023, 12:11 p.m.

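The tagging task is handled as text generation. The model accepts up to 512 English words of input; once the start and end tokens and other tokenization overhead are subtracted, about 500 words remain usable. The original dataset contains more than 70,000 distinct tags, which makes a classification approach essentially infeasible, so a pretrained BART model is used to generate the tags instead. Seq2seq really can do anything.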
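Because only about 500 words fit, a client may want to trim long articles before sending them. The following is a minimal sketch, assuming that whitespace-separated words are a close enough proxy for how the model counts its input (the model's own tokenizer may count differently). The helper name `truncate_words` is hypothetical and not part of the service.

```python
def truncate_words(text: str, max_words: int = 500) -> str:
    """Keep at most max_words whitespace-separated words of text.

    Assumption: 500 matches the approximate usable input length
    mentioned above; adjust it if the service behaves differently.
    """
    words = text.split()
    if len(words) <= max_words:
        # Short enough: return the text unchanged, spacing included.
        return text
    # Too long: keep the first max_words words, joined by single spaces.
    return " ".join(words[:max_words])
```

Text that is already short enough is returned untouched, so the helper can be applied to every input unconditionally.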

The model automatically extracts keywords from English text.

The contents of the docker-compose.yaml file are as follows:

version: "2.1"
services:
  en_articles_tags:
    image: napoler/en_articles_tags:v01
    container_name: en_articles_tags
    ports:
      - 3014:3000
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: "4.00"

curl -X 'POST' \
  'http://192.168.1.18:3014/predict_articles_tags' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"config":{"do_sample": false},"text":"Cat food is food for consumption by cats. Cats have specific requirements for their dietary nutrients. Certain nutrients, including many vitamins and amino acids, are degraded by the temperatures, pressures and chemical treatments used during manufacture, and hence must be added after manufacture to avoid nutritional deficiency."}'

The result is as follows:

{
  "results": [
    "Cats",
    "Food",
    "Nutrition"
  ]
}
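The same request can also be made from Python. This is a rough sketch using only the standard library. It assumes the host, port, endpoint path, and JSON shape shown in the curl example and its response; the function names `build_request` and `predict_tags` are made up for illustration.

```python
import json
import urllib.request

# Assumption: same address as in the curl example; change it for your host.
API_URL = "http://192.168.1.18:3014/predict_articles_tags"


def build_request(text: str, url: str = API_URL,
                  do_sample: bool = False) -> urllib.request.Request:
    """Build the POST request with the same body and headers as the curl call."""
    payload = {"config": {"do_sample": do_sample}, "text": text}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def predict_tags(text: str, url: str = API_URL, timeout: float = 30.0) -> list:
    """Send the text to the service and return the list under "results"."""
    request = build_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the response has the {"results": [...]} shape shown above.
    return body["results"]
```

With the container running, `predict_tags("Cat food is food for consumption by cats. ...")` should return a list of tag strings in the same form as the JSON result above.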
