goblin

Deploying DeepSeek with vLLM

March 17, 2025 · 5 min read

vLLM

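vLLM is a fast, easy-to-use library for LLM inference and serving. It is a high-throughput, low-latency inference and serving framework for large language models, originally developed by a team at UC Berkeley. It is designed for production-scale deployment and is particularly good at long contexts (8k+ tokens) and highly concurrent requests while making much more efficient use of GPU memory, which makes it one of the highest-throughput open-source LLM inference engines available. See the official website for details.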

My environment

Detail	Description
CPU	64 cores
Memory	500 GiB
GPU	NVIDIA A100 80 GB × 4
Data disk	500 GiB
Operating system	Ubuntu 24.04

Installing CUDA

Go to the CUDA Toolkit Archive page.
Select the CUDA version that matches your driver.

Get the download URL for the CUDA installer.

Installing the CUDA Toolkit

  wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-ubuntu2404.pin
  sudo mv cuda-ubuntu2404.pin /etc/apt/preferences.d/cuda-repository-pin-600
  wget https://developer.download.nvidia.com/compute/cuda/12.8.1/local_installers/cuda-repo-ubuntu2404-12-8-local_12.8.1-570.124.06-1_amd64.deb
  sudo dpkg -i cuda-repo-ubuntu2404-12-8-local_12.8.1-570.124.06-1_amd64.deb
  sudo cp /var/cuda-repo-ubuntu2404-12-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
  sudo apt-get update
  sudo apt-get -y install cuda-toolkit-12-8

Installing the driver
```
  sudo apt-get install -y cuda-drivers
```

Configuring the CUDA environment variables

  echo 'export PATH=/usr/local/cuda/bin:$PATH' | sudo tee /etc/profile.d/cuda.sh
  source /etc/profile

Verifying the CUDA installation
```
  nvcc -V
  # GPU information
  nvidia-smi
```

Installing conda

vLLM needs a Python environment; creating a fresh one with conda is recommended.

# (Recommended) Create a new conda environment.
conda create -n vllm python=3.12 -y
conda activate vllm

Switch to the environment:

conda activate vllm

Installing vLLM

pip install vllm

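Downloading the model

You can download the model directly from Hugging Face, from a mirror site, or from the ModelScope community.

The Hugging Face mirror site in China is fully synchronized with huggingface.co, so there is no version lag. Models on ModelScope may differ slightly from those on Hugging Face.

Installing huggingface-cli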

pip install --upgrade huggingface_hub

# To use the mirror site, uncomment the next line
# export HF_ENDPOINT=https://hf-mirror.com
huggingface-cli download --resume-download adept/fuyu-8b --cache-dir ./path/to/cache

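Running the model

To expose the model as a service, start vLLM in serving mode. A launch example:

# --max-model-len           maximum context length
# --tensor-parallel-size    number of GPUs
# --gpu-memory-utilization  fraction of GPU memory vLLM may use (default 0.9)
# --dtype                   compute precision
# --enforce-eager           disable CUDA graphs (eager mode) for better compatibility
python -m vllm.entrypoints.openai.api_server \
  --model "/data/models/DeepSeek-R1" \
  --served-model-name deepseek_r1 \
  --host 0.0.0.0 \
  --port 8080 \
  --max-model-len 4096 \
  --tensor-parallel-size 4 \
  --gpu-memory-utilization 0.95 \
  --dtype float16 \
  --trust-remote-code \
  --enforce-eager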

The same launch using vllm serve:

vllm serve /data/models/DeepSeek-R1 \
  --served-model-name deepseek_r1 \
  --host 0.0.0.0 \
  --port 8080 \
  --max-model-len 4096 \
  --tensor-parallel-size 4 \
  --gpu-memory-utilization 0.95 \
  --dtype float16 \
  --trust-remote-code \
  --enforce-eager
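
Two of these flags come down to simple arithmetic. The sketch below is only an illustration: it assumes the usual constraint that a model's attention heads must split evenly across the tensor-parallel GPUs, and the head count of 64 is a hypothetical value, not one read from any model config.

```python
def per_gpu_budget_gib(gpu_total_gib: float, utilization: float) -> float:
    """Memory each GPU may use for a given --gpu-memory-utilization value."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return gpu_total_gib * utilization


def heads_per_gpu(num_attention_heads: int, tensor_parallel_size: int) -> int:
    """Attention heads per GPU; raises if they do not divide evenly."""
    if num_attention_heads % tensor_parallel_size != 0:
        raise ValueError(
            f"{num_attention_heads} heads cannot be split across "
            f"{tensor_parallel_size} GPUs"
        )
    return num_attention_heads // tensor_parallel_size


if __name__ == "__main__":
    # 80 GB card with --gpu-memory-utilization 0.95
    print(round(per_gpu_budget_gib(80, 0.95), 2))
    # hypothetical 64-head model with --tensor-parallel-size 4
    print(heads_per_gpu(64, 4))
```

With the values used in this post, each 80 GB card is allowed roughly 76 GB, which has to hold the model shard, activations, and the KV cache.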

Verifying the service

curl --location 'http://127.0.0.1:8080/v1/chat/completions' --header 'Content-Type: application/json' --data '{
    "model": "deepseek_r1",
    "messages": [{"role": "user", "content": "hello"}]
}'
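
The same check can be scripted. This is a minimal sketch using only the Python standard library; it assumes the server started above is listening on 127.0.0.1:8080 and returns the OpenAI-style chat completion response shape. The function names are illustrative.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, user_message: str) -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Pull the assistant text out of an OpenAI-style chat completion."""
    return response_body["choices"][0]["message"]["content"]


def chat(base_url: str, model: str, user_message: str, timeout: float = 60.0) -> str:
    """Send one message and return the reply (requires a running server)."""
    request = build_chat_request(base_url, model, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(json.loads(response.read().decode("utf-8")))
```

Calling `chat("http://127.0.0.1:8080", "deepseek_r1", "hello")` should return the model's reply as a string once the service is up.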

Official benchmarks

git clone https://github.com/vllm-project/vllm.git
cd vllm/benchmarks

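Meaning of the benchmark metrics

Metric	Meaning
Avg prompt throughput	Input throughput (prompt tokens/s); 0.0 means no new requests are currently arriving
Avg generation throughput	Generation throughput (generated tokens/s); 86.8 means the model is producing 86.8 tokens per second
Running	Number of requests currently being processed (actively generating)
Swapped	Number of requests swapped out (when GPU memory runs short, some requests are moved to CPU memory)
Pending	Number of waiting requests (not yet scheduled)
GPU KV cache usage	Utilization of the GPU key-value cache; the higher the value, the more GPU memory is in use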
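
To track these numbers over time, they can be pulled out of the server log. The sketch below assumes a log line shaped like the sample in the test data; the exact wording varies between vLLM versions, so treat the pattern as a starting point rather than the log format.

```python
import re

# Matches "<metric name>: <number>" for the metrics in the table above.
METRIC_PATTERN = re.compile(
    r"(Avg prompt throughput|Avg generation throughput|Running|Swapped|Pending"
    r"|GPU KV cache usage):\s*([0-9]+(?:\.[0-9]+)?)"
)


def parse_metrics(line: str) -> dict:
    """Return {metric name: value} for every known metric found in the line."""
    return {name: float(value) for name, value in METRIC_PATTERN.findall(line)}


# Illustrative line in the assumed format (not copied from a real log).
SAMPLE_LINE = (
    "Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 86.8 tokens/s, "
    "Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 1.2%"
)

if __name__ == "__main__":
    print(parse_metrics(SAMPLE_LINE))
```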

Benchmark example:

CUDA_VISIBLE_DEVICES=0,1,2,3 python benchmark_throughput.py \
  --model "/data/models/deepseek-70b" \
  --backend vllm \
  --input-len 4096 \
  --output-len 10000 \
  --num-prompts 50 \
  --seed 1100 \
  --dtype float16  \
  --tensor-parallel-size 2 \
  --gpu-memory-utilization 0.95 \
  --max-model-len 16384 \
  --cpu-offload-gb 10 \
  --enforce-eager 

Automatic Certificate Management for K8s Ingress

February 27, 2025 · 2 min read

goblin

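Prerequisites

A running Kubernetes cluster
An Alibaba Cloud account with a DNS domain already created
An Alibaba Cloud AccessKey and SecretKey, which cert-manager uses to configure DNS records automatically

Installing cert-manager

Official website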

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.17.0/cert-manager.yaml

DNS01

Official website. Note that HTTP01 does not support wildcard domains.

Installing alidns-webhook

helm repo add cert-manager-alidns-webhook https://devmachine-fr.github.io/cert-manager-alidns-webhook
helm repo update
helm install alidns-webhook cert-manager-alidns-webhook/alidns-webhook --set groupName=example.com

Creating the Alibaba Cloud DNS credentials

  apiVersion: v1
  kind: Secret
  metadata:
    name: alidns-secrets
    namespace: cert-manager
  stringData:
    access-key: xxx
    secret-key: xxx

Creating the ClusterIssuer

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    email: [email protected]
    server: https://acme-v02.api.letsencrypt.org/directory  # for testing, use staging (https://acme-staging-v02.api.letsencrypt.org/directory)
    privateKeySecretRef:
      name: letsencrypt
    solvers:
    - dns01:
        webhook:
            config:
              accessTokenSecretRef:
                key: access-key
                name: alidns-secrets
              regionId: cn-beijing # your Alibaba Cloud region
              secretKeySecretRef:
                key: secret-key
                name: alidns-secrets
            groupName: example.com # groupName must match the one configured on webhook deployment (see Helm chart's values) !
            solverName: alidns-solver

Creating a Certificate that uses the ClusterIssuer

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-tls
spec:
  secretName: example-com-tls
  dnsNames:
  - example.com
  - "*.example.com"
  issuerRef:
    name: letsencrypt
    kind: ClusterIssuer

Configuring the Ingress to use the certificate

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: 'true'
spec:
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: app
            port:
              number: 80
  tls:
  - hosts:
    - app.example.com
    secretName: example-com-tls
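
The Ingress host app.example.com can reuse the example-com-tls secret because the Certificate lists *.example.com. A small sketch of the wildcard rule, assuming the standard behaviour that `*` stands for exactly one left-most label; the function names are illustrative.

```python
def name_covers(pattern: str, host: str) -> bool:
    """True if one certificate DNS name (optionally '*.'-prefixed) covers host."""
    pattern, host = pattern.lower(), host.lower()
    if not pattern.startswith("*."):
        return pattern == host
    suffix = pattern[1:]  # e.g. ".example.com"
    if not host.endswith(suffix):
        return False
    left_label = host[: -len(suffix)]
    # The wildcard stands for exactly one non-empty label.
    return left_label != "" and "." not in left_label


def certificate_covers(dns_names: list, host: str) -> bool:
    """True if any of the certificate's dnsNames covers host."""
    return any(name_covers(name, host) for name in dns_names)


if __name__ == "__main__":
    names = ["example.com", "*.example.com"]
    for host in ("app.example.com", "example.com", "a.b.example.com"):
        print(host, certificate_covers(names, host))
```

A host such as a.b.example.com is not covered, so it would need its own entry in dnsNames.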

etcd Backup and Restore

December 25, 2023 · One min read

goblin

Checking cluster health

$ ETCDCTL_API=3 etcdctl --cacert=/opt/kubernetes/ssl/ca.pem --cert=/opt/kubernetes/ssl/server.pem --key=/opt/kubernetes/ssl/server-key.pem --endpoints=https://10.0.1.2:2379,https://10.0.1.3:2379,https://10.0.1.4:2379 endpoint health

https://10.0.1.2:2379 is healthy: successfully committed proposal: took = 1.698385ms 
https://10.0.1.3:2379 is healthy: successfully committed proposal: took = 1.577913ms 
https://10.0.1.4:2379 is healthy: successfully committed proposal: took = 5.616079ms

Getting a single key

ETCDCTL_API=3 etcdctl --cacert=/opt/kubernetes/ssl/ca.pem --cert=/opt/kubernetes/ssl/server.pem --key=/opt/kubernetes/ssl/server-key.pem --endpoints=https://10.0.1.2:2379,https://10.0.1.3:2379,https://10.0.1.4:2379 get /registry/apiregistration.k8s.io/apiservices/v1.apps

Listing all keys

ETCDCTL_API=3 etcdctl --cacert=/opt/kubernetes/ssl/ca.pem --cert=/opt/kubernetes/ssl/server.pem --key=/opt/kubernetes/ssl/server-key.pem --endpoints=https://10.0.1.2:2379,https://10.0.1.3:2379,https://10.0.1.4:2379 get / --prefix --keys-only

Backing up with snapshot save

snapshot save must be sent to a single endpoint, so only one member is listed here.

ETCDCTL_API=3 etcdctl --cacert=/opt/kubernetes/ssl/ca.pem --cert=/opt/kubernetes/ssl/server.pem --key=/opt/kubernetes/ssl/server-key.pem --endpoints=https://10.0.1.2:2379 snapshot save /data/etcd_backup/etcd-snapshot-$(date +%Y%m%d).db

Keeping backups for 10 days

find /data/etcd_backup/ -name '*.db' -mtime +10 -exec rm -f {} \;
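
If the cleanup runs from a Python job instead of cron plus find, the same retention rule can be sketched as below. It follows find's `-mtime +10` semantics (the age in whole days must be greater than 10); the function name and the `now` parameter are illustrative and exist mainly to make the rule testable.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def prune_snapshots(backup_dir: str, keep_days: int = 10, now: float = None) -> list:
    """Delete *.db files whose age in whole days exceeds keep_days.

    Returns the names of the removed files, sorted by path.
    """
    now = time.time() if now is None else now
    removed = []
    for path in sorted(Path(backup_dir).rglob("*.db")):
        if not path.is_file():
            continue
        age_days = int((now - path.stat().st_mtime) // SECONDS_PER_DAY)
        if age_days > keep_days:
            path.unlink()
            removed.append(path.name)
    return removed
```

Note that a file that is 10 days and a few hours old is kept, exactly as with `-mtime +10`.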

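Restoring a backup

Copy the etcd snapshot to every member, stop kube-apiserver on all nodes, then stop etcd on all members. Run the restore on each member with its own --name and --initial-advertise-peer-urls; the example below is for etcd-0.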

ETCDCTL_API=3 etcdctl snapshot restore /data/etcd_backup/etcd-snapshot-20231225.db \
  --name etcd-0 \
  --initial-cluster "etcd-0=https://10.0.1.2:2380,etcd-1=https://10.0.1.3:2380,etcd-2=https://10.0.1.4:2380" \
  --initial-cluster-token etcd-cluster \
  --initial-advertise-peer-urls https://10.0.1.2:2380 \
  --data-dir=/var/lib/etcd/default.etcd
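
Only --name and --initial-advertise-peer-urls change from member to member, so the per-node commands can be generated. This is a sketch that merely builds command strings from the member list used above; the function name and defaults are illustrative.

```python
import shlex


def restore_commands(
    snapshot: str,
    members: dict,
    token: str = "etcd-cluster",
    data_dir: str = "/var/lib/etcd/default.etcd",
) -> dict:
    """Return {member name: restore command to run on that member}."""
    initial_cluster = ",".join(f"{name}={url}" for name, url in members.items())
    commands = {}
    for name, peer_url in members.items():
        args = [
            "etcdctl", "snapshot", "restore", snapshot,
            "--name", name,
            "--initial-cluster", initial_cluster,
            "--initial-cluster-token", token,
            "--initial-advertise-peer-urls", peer_url,
            "--data-dir", data_dir,
        ]
        commands[name] = "ETCDCTL_API=3 " + " ".join(shlex.quote(a) for a in args)
    return commands


if __name__ == "__main__":
    members = {
        "etcd-0": "https://10.0.1.2:2380",
        "etcd-1": "https://10.0.1.3:2380",
        "etcd-2": "https://10.0.1.4:2380",
    }
    for command in restore_commands(
        "/data/etcd_backup/etcd-snapshot-20231225.db", members
    ).values():
        print(command)
```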

Common MySQL Statements

December 14, 2023 · 2 min read

goblin

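MySQL's information_schema database lets you query how much space each table occupies and how many rows it holds. The relevant columns of its TABLES table are:

TABLE_SCHEMA: database name
TABLE_NAME: table name
ENGINE: storage engine in use
TABLE_ROWS: number of rows
DATA_LENGTH: data size
INDEX_LENGTH: index size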

Size of all databases

use information_schema;
select concat(round(sum(DATA_LENGTH/1024/1024),2),'MB') as data  from TABLES;

Size of one database

select concat(round(sum(DATA_LENGTH/1024/1024),2),'MB') as data  from TABLES where table_schema='xxx';

Size of one table in a database

select concat(round(sum(DATA_LENGTH/1024/1024),2),'MB') as data  from TABLES where table_schema='xxx' and table_name='xxx';

Index size of one database

SELECT CONCAT(ROUND(SUM(index_length)/(1024*1024), 2), ' MB') AS 'Total Index Size' FROM TABLES  WHERE table_schema = 'xxx';

Index size of one table in a database

SELECT CONCAT(ROUND(SUM(index_length)/(1024*1024), 2), ' MB') AS 'Total Index Size' FROM TABLES  WHERE table_schema = 'xxx' and table_name='xxx';

Per-table overview of one database

SELECT CONCAT(table_schema, '.', table_name) AS 'Table Name',
       CONCAT(ROUND(table_rows/1000000, 4), 'M') AS 'Number of Rows',
       CONCAT(ROUND(data_length/(1024*1024*1024), 4), 'G') AS 'Data Size',
       CONCAT(ROUND(index_length/(1024*1024*1024), 4), 'G') AS 'Index Size',
       CONCAT(ROUND((data_length+index_length)/(1024*1024*1024), 4), 'G') AS 'Total'
FROM information_schema.TABLES
WHERE table_schema LIKE 'xxx';

Non-Sleep connections, ordered by elapsed time (descending)

show full processlist lists every connection.

select id, db, user, host, command, time, state, info
from information_schema.processlist
where command != 'Sleep'
order by time desc; 

Threads running longer than 2 minutes, rendered as kill statements

select concat('kill ', id, ';')
from information_schema.processlist
where command != 'Sleep'
and time > 2*60
order by time desc;
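
The filter is easy to reproduce client-side when the process list has already been fetched, for example to review the statements before running them. A sketch over plain dictionaries; the row keys mirror the processlist columns and the sample rows in the usage are made up.

```python
def kill_statements(processlist: list, min_seconds: int = 120) -> list:
    """Build 'kill <id>;' for non-Sleep threads running longer than min_seconds.

    Statements are ordered by elapsed time, longest first, like the SQL above.
    """
    long_running = [
        row for row in processlist
        if row["command"] != "Sleep" and row["time"] > min_seconds
    ]
    long_running.sort(key=lambda row: row["time"], reverse=True)
    return [f"kill {row['id']};" for row in long_running]


if __name__ == "__main__":
    rows = [
        {"id": 11, "command": "Query", "time": 300},
        {"id": 12, "command": "Sleep", "time": 900},
        {"id": 13, "command": "Query", "time": 45},
        {"id": 14, "command": "Query", "time": 1200},
    ]
    print("\n".join(kill_statements(rows)))
```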

Quickly killing all connections

mysql -e "show full processlist;" -ss | awk '{print "KILL "$1";"}'| mysql

Common Nginx Configuration

December 10, 2023 · 6 min read

goblin

Getting the client's real IP, domain, protocol, and port

proxy_set_header Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;

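Host carries the domain and port the client actually requested;
X-Forwarded-Proto carries the protocol the client actually used;
X-Real-IP carries the client's real IP;
X-Forwarded-For is similar to X-Real-IP, but behind several proxy layers it contains the real client IP followed by the IP of each intermediate proxy.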
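
How X-Forwarded-For grows hop by hop can be sketched in a few lines. The first function imitates what $proxy_add_x_forwarded_for produces (the incoming header with the peer address appended); the names and addresses are illustrative.

```python
from typing import Optional


def proxy_add_x_forwarded_for(incoming_header: Optional[str], remote_addr: str) -> str:
    """Append the connecting peer's address to any existing X-Forwarded-For value."""
    if incoming_header:
        return f"{incoming_header}, {remote_addr}"
    return remote_addr


def original_client(x_forwarded_for: str) -> str:
    """Left-most entry: the client as reported by the first proxy in the chain."""
    return x_forwarded_for.split(",")[0].strip()


if __name__ == "__main__":
    # Client 203.0.113.7 -> edge proxy (10.0.0.5) -> inner proxy -> application
    at_edge = proxy_add_x_forwarded_for(None, "203.0.113.7")
    at_inner = proxy_add_x_forwarded_for(at_edge, "10.0.0.5")
    print(at_inner)
    print(original_client(at_inner))
```

Because a client can send its own X-Forwarded-For header, the left-most entry is only trustworthy when the outermost proxy is under your control.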

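Load balancing

After max_fails failed requests within fail_timeout, nginx takes the server out of rotation; once a further fail_timeout has elapsed, the server is put back on the live list and retried.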

http {
    upstream server_name {
        server IP:Port weight=1 max_fails=2 fail_timeout=60s;
        server IP:Port weight=2 max_fails=2 fail_timeout=60s;
    }
    server {
        listen  80;
        location / {
            proxy_pass http://server_name;
        }
    }
}
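
The interaction of max_fails and fail_timeout is easier to see in a toy model. The class below is a simplified sketch of the rule described above, not nginx's actual implementation; time is passed in explicitly so the behaviour is deterministic.

```python
class Peer:
    """Toy model of passive health checking with max_fails / fail_timeout."""

    def __init__(self, name: str, max_fails: int = 2, fail_timeout: float = 60.0):
        self.name = name
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.failures = []      # timestamps of recent failures
        self.down_until = 0.0   # peer is skipped until this time

    def available(self, now: float) -> bool:
        """True if the peer may receive requests at time `now`."""
        return now >= self.down_until

    def record_failure(self, now: float) -> None:
        """Register a failed request; mark the peer down once the limit is hit."""
        window_start = now - self.fail_timeout
        self.failures = [t for t in self.failures if t > window_start]
        self.failures.append(now)
        if len(self.failures) >= self.max_fails:
            self.down_until = now + self.fail_timeout
            self.failures.clear()


if __name__ == "__main__":
    peer = Peer("backend-1", max_fails=2, fail_timeout=60.0)
    peer.record_failure(now=0.0)
    peer.record_failure(now=10.0)   # second failure inside the window
    print(peer.available(now=30.0), peer.available(now=70.0))
```

Two failures more than fail_timeout apart never trip the limit, which is why occasional errors do not remove a backend.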

Static asset caching

location ~* \.(gif|jpg|jpeg|bmp|png|ico|txt|js|css)$ {
    expires      3d;
    add_header Static Nginx-Proxy;
}

Dynamic blacklist

Basic configuration

location / {
    deny  192.168.1.1;
    deny  192.168.1.0/24;
    allow 10.0.0.0/16;
    allow 2001:0db8::/32;
    deny  all;
}

Dynamic blacklist with Lua + Redis (OpenResty)

yum install yum-utils
yum-config-manager --add-repo https://openresty.org/package/centos/openresty.repo
yum install openresty
yum install openresty-resty
yum --disablerepo="*" --enablerepo="openresty" list available
service openresty start

Configuration (/usr/local/openresty/nginx/conf/nginx.conf)

lua_shared_dict ip_blacklist 1m;

server {
    listen  80;

    location / {
        access_by_lua_file lua/ip_blacklist.lua;
        proxy_pass http://server_name;
    }
}

ip_blacklist.lua

local redis_host    = "192.168.1.100"
local redis_port    = 6379
local redis_pwd     = "123456"
local redis_db      = 1

-- connection timeout for redis in ms.
local redis_connection_timeout = 100

-- a set key for blacklist entries
local redis_key     = "ip_blacklist"

-- cache lookups for this many seconds
local cache_ttl     = 60

-- end configuration

local ip                = ngx.var.remote_addr
local ip_blacklist      = ngx.shared.ip_blacklist
local last_update_time  = ip_blacklist:get("last_update_time")

-- update ip_blacklist from Redis every cache_ttl seconds:
if last_update_time == nil or last_update_time < (ngx.now() - cache_ttl) then

  local redis = require "resty.redis"
  local red = redis:new()

  red:set_timeout(redis_connection_timeout)

  local ok, err = red:connect(redis_host, redis_port)
  if not ok then
    ngx.log(ngx.ERR, "Redis connection error while connect: " .. err)
  else
    ok, err = red:auth(redis_pwd)
    if not ok then
      ngx.log(ngx.ERR, "Redis password error while auth: " .. err)
    else
      -- switch to the configured database before reading the set
      red:select(redis_db)
      local new_ip_blacklist, err = red:smembers(redis_key)
      if err then
        ngx.log(ngx.ERR, "Redis read error while retrieving ip_blacklist: " .. err)
      else
        -- new_ip_blacklist is a table, so log its size rather than concatenating it
        ngx.log(ngx.INFO, "Get data success: " .. #new_ip_blacklist .. " entries")
        -- replace the locally stored ip_blacklist with the updated values:
        ip_blacklist:flush_all()
        for index, banned_ip in ipairs(new_ip_blacklist) do
          ip_blacklist:set(banned_ip, true)
        end
        -- update time
        ip_blacklist:set("last_update_time", ngx.now())
      end
    end
  end
end

if ip_blacklist:get(ip) then
  ngx.log(ngx.ERR, "Banned IP detected and refused access: " .. ip)
  return ngx.exit(ngx.HTTP_FORBIDDEN)
end
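
The script's caching behaviour (refresh at most once per cache_ttl, keep the stale list when Redis is unreachable) can be modelled without OpenResty. The sketch below injects the fetch function and the clock, so it has no Redis dependency; the class and parameter names are illustrative.

```python
class CachedBlacklist:
    """Blacklist that refreshes from a backing store at most once per ttl seconds."""

    def __init__(self, fetch, ttl: float = 60.0):
        self._fetch = fetch          # callable returning an iterable of banned IPs
        self._ttl = ttl
        self._banned = set()
        self._last_update = None     # None means "never loaded"

    def is_banned(self, ip: str, now: float) -> bool:
        """Refresh the cache if it is older than ttl, then look the IP up."""
        if self._last_update is None or self._last_update < now - self._ttl:
            try:
                self._banned = set(self._fetch())
                self._last_update = now
            except Exception:
                # Keep serving the stale list and retry on the next request.
                pass
        return ip in self._banned


if __name__ == "__main__":
    blacklist = CachedBlacklist(lambda: ["192.0.2.10"], ttl=60.0)
    print(blacklist.is_banned("192.0.2.10", now=0.0))
    print(blacklist.is_banned("192.0.2.11", now=5.0))
```

The trade-off is the same as in the Lua version: a newly banned address can take up to ttl seconds to be enforced.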

Websocket

map $http_upgrade $connection_upgrade { 
    default upgrade; 
    '' close; 
} 
upstream ws_backend{ 
    server IP:Port; 
    keepalive 1000; 
}
server { 
    listen 80; 
    location /{ 
        proxy_http_version 1.1; 
        proxy_pass http://ws_backend; 
        proxy_redirect off; 
        proxy_set_header Host $host; 
        proxy_set_header X-Real-IP $remote_addr; 
        proxy_read_timeout 3600s; 
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; 
        proxy_set_header Upgrade $http_upgrade; 
        proxy_set_header Connection $connection_upgrade; 
    } 
}

nginx location matching rules

location [=|~|~*|^~] /uri/ { ... }

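= exact match; if the request matches this location, the search stops and the request is handled immediately
~ case-sensitive regular-expression match
~* case-insensitive regular-expression match
!~ case-sensitive regular-expression non-match (an operator for if conditions, not a location modifier)
!~* case-insensitive regular-expression non-match (likewise only for if conditions)
^~ prefix match on a plain string; if this is the longest matching prefix, nginx skips the regular-expression checks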
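
The order in which these modifiers are evaluated matters more than the order of the blocks in the file. Below is a simplified sketch of the selection order (exact match, then longest prefix, then regular expressions in file order, then the remembered prefix). It uses Python's `re` in place of PCRE and ignores named locations, so it is a model rather than a reimplementation.

```python
import re


def match_location(uri: str, locations: list):
    """Pick a location for uri.

    locations: list of (modifier, pattern) in config order, where modifier is
    one of "", "=", "^~", "~", "~*". Returns the chosen tuple or None.
    """
    # 1. An exact match wins immediately.
    for modifier, pattern in locations:
        if modifier == "=" and uri == pattern:
            return (modifier, pattern)

    # 2. Remember the longest matching prefix (plain or ^~).
    best = None
    for modifier, pattern in locations:
        if modifier in ("", "^~") and uri.startswith(pattern):
            if best is None or len(pattern) > len(best[1]):
                best = (modifier, pattern)
    if best is not None and best[0] == "^~":
        return best  # ^~ suppresses the regex checks

    # 3. The first matching regular expression, in config order, wins.
    for modifier, pattern in locations:
        if modifier in ("~", "~*"):
            flags = re.IGNORECASE if modifier == "~*" else 0
            if re.search(pattern, uri, flags):
                return (modifier, pattern)

    # 4. Otherwise fall back to the remembered prefix.
    return best


if __name__ == "__main__":
    config = [
        ("", "/"),
        ("=", "/exact"),
        ("^~", "/static/"),
        ("~*", r"\.(gif|jpg|png)$"),
        ("", "/docs/"),
    ]
    for uri in ("/exact", "/static/logo.PNG", "/docs/pic.jpg", "/docs/readme"):
        print(uri, "->", match_location(uri, config))
```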

nginx.conf

user  nginx;
worker_processes  auto;
worker_rlimit_nofile  65535;

error_log  /data/logs/nginx/error.log notice;
pid        /var/run/nginx.pid;
events {
    worker_connections  65535;
    multi_accept on;
    use epoll;
}
http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;
    log_format  main  '[$time_local] RemoteAddr:"$remote_addr" RemoteUser:"$remote_user" Host:"$host" '
                      'RequestUil:"$request" HttpStatus:"$status" BodyBytesSent:"$body_bytes_sent" '
                      'HttpReferer:"$http_referer" HttpUserAgent:"$http_user_agent" '
                      'Http_X_ForwardedFor:"$http_x_forwarded_for" UpstreamResponseTime:"$upstream_response_time" '
                      'UpstreamAddr:"$upstream_addr" RequestTime:"$request_time" --- $server_port';
    log_format  json  escape=json '{'
        '"RemoteAddr":"$remote_addr",'
        '"RemoteUser":"$remote_user",'
        '"TimeLocal":"$time_local",'
        '"RequestUil":"$request",'
        '"HttpHost":"$http_host",'
        '"HttpStatus":"$status",'
        '"BodyBytesSent":"$body_bytes_sent",'
        '"HttpReferer":"$http_referer",'
        '"HttpUserAgent":"$http_user_agent",'
        '"Http_X_ForwardedFor":"$http_x_forwarded_for",'
        '"SslProtocol":"$ssl_protocol",'
        '"SslCipher":"$ssl_cipher",'
        '"UpstreamResponseTime":"$upstream_response_time",'
        '"UpstreamAddr":"$upstream_addr",'
        '"RequestTime":"$request_time"'
    '}';

    # Access logging is off at the http level; each server block enables its own access_log.
    access_log                      off;
    server_tokens                   off;
    sendfile                         on;
    tcp_nopush                       on;
    tcp_nodelay                      on;
    send_timeout                    300;
    keepalive_timeout               300;
    resolver_timeout                 60;
    server_names_hash_max_size      512;
    server_names_hash_bucket_size   128;

    client_body_timeout             300;
    client_header_timeout           300;
    client_header_buffer_size      512k;
    client_max_body_size           300m;
    large_client_header_buffers  8  32k;
    client_body_buffer_size        256k;

    fastcgi_connect_timeout         300;
    fastcgi_send_timeout            300;
    fastcgi_read_timeout            300;
    fastcgi_buffer_size            128k;
    fastcgi_buffers          8     256k;
    fastcgi_busy_buffers_size      256k;
    fastcgi_temp_file_write_size   256k;
    fastcgi_temp_path /tmp/ngx_fcgi_tmp;
    fastcgi_cache_path /tmp/fcgi_cache_path levels=1:2 keys_zone=ngx_fcgi_cache:512m inactive=1d max_size=10g;

    gzip  on;
    gzip_http_version   1.1;
    gzip_min_length      1k;
    gzip_buffers 4      16k;
    gzip_comp_level       9;
    gzip_types text/plain application/json application/x-javascript text/css application/xml text/javascript application/x-httpd-php image/jpeg image/gif image/png;
    gzip_vary            on;
    gzip_disable "MSIE [1-6]\.";

    proxy_http_version           1.1;
    proxy_set_header Connection   "";
    proxy_set_header     Host  $host;
    proxy_connect_timeout        300;
    proxy_read_timeout           300;
    proxy_send_timeout           300;
    proxy_buffering               on;
    proxy_buffer_size           128k;
    proxy_buffers            8  128k;
    proxy_busy_buffers_size     256k;
    proxy_temp_file_write_size  256k;
    proxy_temp_path  /tmp/proxy_temp_path;
    proxy_cache_path /tmp/proxy_cache_path levels=1:2 keys_zone=ngx_proxy_cache:512m inactive=1d max_size=10g;
    include /etc/nginx/conf.d/*.conf;
}
include /etc/nginx/stream.d/*.conf;

stream configuration

stream {
    upstream server_name {
        server  IP:Port;
    }
    server {
        listen Port;
        proxy_pass server_name;
        proxy_connect_timeout 1h;
        proxy_timeout 1h;
    }
}

server configuration

server {
    listen   80;
    listen   81;
    server_name 127.0.0.1  example.com;
    index  index.php index.html index.htm;
    root /data/www;
    charset utf-8;
    access_log  /data/logs/example.com.log  main;

    location / {
        if (!-e $request_filename) {
            rewrite ^/(.*) /index.php?$1 last;
        }
    }
    location /xxx/ {
        if ($arg_icpid = "4pd1mtsDhfe" ) {
            proxy_pass http://127.0.0.1:38888/test$request_uri&tbid=aldIthSBg04;
        }
        proxy_pass http://127.0.0.1:9302;
    }
    location /yyy/ {
        proxy_pass http://127.0.0.1:38888/;
    }
    location ~ \.php$ {
        fastcgi_param  REMOTE_ADDR $http_x_real_ip;
        fastcgi_param LY_ADDRESS $remote_addr;
        fastcgi_pass   unix:/dev/shm/php-cgi.sock;
        fastcgi_index  index.php;
        fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
        fastcgi_param  SERVERNAME     $hostname;
        include        fastcgi_params;
    }
    location = /favicon.ico {
        log_not_found off;
        access_log off;
    }
    location = /robots.txt {
        allow all;
        log_not_found off;
        access_log off;
    }
}

server {
    listen   80;
    listen   443 ssl;
    server_name  example.com;
    charset utf-8;
    access_log  /data/logs/example.com.log main;

    ssl_certificate   /etc/nginx/cert/example.com.pem;
    ssl_certificate_key  /etc/nginx/cert/example.com.key;
    ssl_session_timeout 5m;
    ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE:ECDH:AES:HIGH:!NULL:!aNULL:!MD5:!ADH:!RC4;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_prefer_server_ciphers on;

    location / {
        proxy_pass http://127.0.0.1:38888;
    }
    location = /favicon.ico {
        log_not_found off;
        access_log off;
    }
    location = /robots.txt {
        allow all;
        log_not_found off;
        access_log off;
    }
}

Istio Returns a 426 Status Code

September 16, 2023 · One min read

goblin

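HTTP/1.0

Istio forwards HTTP requests through Envoy, which by default requires HTTP/1.1 or HTTP/2. When a client uses HTTP/1.0, Envoy responds with 426 Upgrade Required.

With nginx

When nginx reverse-proxies with proxy_pass, it talks HTTP/1.0 to the upstream by default; set proxy_http_version to 1.1: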

server {
    ...
    location /xxx/ {
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}

Upgrading the CentOS Kernel

September 8, 2023 · One min read

goblin

Importing the repository

rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm

Listing installable packages

# ML is the latest mainline release; LT is the long-term support release
yum --enablerepo="elrepo-kernel" list --showduplicates | sort -r | grep kernel-ml.x86_64

# Install the ML release
yum --enablerepo=elrepo-kernel install  kernel-ml-devel kernel-ml -y

# Install the LT release
yum --enablerepo=elrepo-kernel install kernel-lt-devel kernel-lt -y

Viewing the current kernel boot order

awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg

CentOS Linux (4.4.179-1.el7.elrepo.x86_64) 7 (Core)

CentOS Linux (3.10.0-693.el7.x86_64) 7 (Core)

Setting the default boot entry

grub2-set-default 0
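
The number passed to grub2-set-default is the zero-based position of the entry in the list printed above. A sketch that reproduces the awk filter and looks up the index for a given kernel version; the grub.cfg excerpt in the usage is a hypothetical two-entry sample built from the titles shown above.

```python
def menu_entries(grub_cfg_text: str) -> list:
    """Titles of top-level menuentry lines, in boot-menu order.

    Same rule as the awk filter: split on single quotes and keep lines whose
    first field is exactly "menuentry ".
    """
    titles = []
    for line in grub_cfg_text.splitlines():
        fields = line.split("'")
        if len(fields) > 1 and fields[0] == "menuentry ":
            titles.append(fields[1])
    return titles


def entry_index(titles: list, version_substring: str) -> int:
    """Zero-based index of the first entry whose title contains the substring."""
    for index, title in enumerate(titles):
        if version_substring in title:
            return index
    raise ValueError(f"no menu entry contains {version_substring!r}")


if __name__ == "__main__":
    sample_cfg = (
        "menuentry 'CentOS Linux (4.4.179-1.el7.elrepo.x86_64) 7 (Core)' --class centos {\n"
        "}\n"
        "menuentry 'CentOS Linux (3.10.0-693.el7.x86_64) 7 (Core)' --class centos {\n"
        "}\n"
    )
    print(entry_index(menu_entries(sample_cfg), "4.4.179"))
```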

July 16, 2023 · 0 min read

goblin
