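# K3s + GitOps Fully Automated Deployment Guide
This guide walks through building a complete K3s + GitOps automation environment on a brand-new cluster, step by step, from scratch.
## 📋 Table of Contents
- [Overview](#overview)
- [Prerequisites](#prerequisites)
- [Quick Start](#quick-start)
- [Detailed Steps](#detailed-steps)
- [Verifying the Deployment](#verifying-the-deployment)
- [Troubleshooting](#troubleshooting)
---
## Overview
This project provides a complete, automated K3s + GitOps deployment setup, suited to building a new cluster from scratch.
### What Gets Deployed
- **K3s cluster**: a lightweight Kubernetes cluster (1 master + N workers)
- **Gitea**: private Git server
- **ArgoCD**: GitOps continuous deployment tool
- **cert-manager**: automatic SSL certificate management
- **Traefik**: Ingress controller (built into K3s)
### Features
- ✅ Fully automated deployment
- ✅ Idempotent (safe to run repeatedly)
- ✅ GitOps workflow
- ✅ Automatic HTTPS configuration
- ✅ Parameterized configuration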
---
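## Prerequisites
### 1. Hardware Requirements
| Role | Minimum | Recommended |
|------|---------|---------|
| Master node | 2 cores, 4 GB RAM | 4 cores, 8 GB RAM |
| Worker node | 2 cores, 4 GB RAM | 4 cores, 8 GB RAM |
### 2. Software Requirements
**Control machine (local machine)**: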
```bash
# Ubuntu/Debian
sudo apt update
sudo apt install -y python3 python3-pip git ansible sshpass
# macOS
brew install python3 ansible git
```
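Before going further, it can help to confirm that the control machine really has the tools installed above. The following is a minimal sketch, assuming a POSIX shell; `missing_tools` is a hypothetical helper name, not a script shipped with this project.

```bash
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not part of this project.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

Calling `missing_tools python3 git ansible-playbook sshpass` prints one line per missing command and nothing when all of them are available.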
**Target nodes (K3s cluster)**:
- Ubuntu 20.04+ / Debian 11+
- SSH access
- sudo privileges
### 3. Gather Configuration Details
Collect the following information:
```yaml
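# Node information
- Master nodes: public IP, private IP, SSH user, SSH password
- Worker nodes: public IP, private IP, SSH user, SSH password
# Domains (optional, for HTTPS)
- Main domain: example.com
- ArgoCD domain: argocd.example.com
- Gitea domain: git.example.com
# Passwords
- ArgoCD admin password
- Gitea admin password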
```
---
## Quick Start
### One-Command Deployment (Recommended)
```bash
# 1. Clone the project
git clone <your-repo-url>
cd k3s自动化部署
# 2. Configure the cluster
cp config/cluster-vars.yml.example config/cluster-vars.yml
vim config/cluster-vars.yml # fill in your actual values
# 3. Run the deployment
chmod +x scripts/*.sh
./scripts/deploy-all.sh
```
---
## Detailed Steps
### Step 1: Configure Cluster Parameters
#### 1.1 Create the Configuration File
```bash
cd k3s自动化部署
cp config/cluster-vars.yml.example config/cluster-vars.yml
vim config/cluster-vars.yml
```
#### 1.2 Sample Configuration
```yaml
# Node configuration
master_nodes:
  - hostname: k3s-master-01
    public_ip: "8.216.38.248"      # replace with your public IP
    private_ip: "172.23.96.138"    # replace with your private IP
    ssh_user: "root"               # replace with your SSH user
    ssh_password: "your-password"  # replace with your SSH password
worker_nodes:
  - hostname: k3s-worker-01
    public_ip: "8.216.41.97"
    private_ip: "172.23.96.139"
    ssh_user: "root"
    ssh_password: "your-password"
# K3s configuration
k3s_version: "v1.28.5+k3s1"
flannel_iface: "eth0"
target_dir: "/home/fei/k3s"
# Domain configuration (optional)
domain_name: "example.com"
argocd_domain: "argocd.example.com"
gitea_domain: "git.example.com"
# Password configuration
argocd_admin_password: "YourStrongPassword123!"
gitea_admin_password: "YourStrongPassword123!"
```
```
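Sample values are easy to leave behind when copying the example file. As a rough pre-flight check, the sketch below lists every line of a config file that still contains one of the sample strings from the example above. `find_placeholders` is a hypothetical helper, and the pattern list is an assumption based only on that sample; if you skip the optional HTTPS setup, leftover `example.com` entries may be harmless.

```bash
# Print (with line numbers) every line of the given file that still holds a
# sample value from the example config. Returns non-zero if any are found.
# find_placeholders is a hypothetical helper, not part of this project.
find_placeholders() {
  if grep -n -E 'your-password|YourStrongPassword123!|example\.com' "$1"; then
    return 1
  fi
  return 0
}
```

For instance, `find_placeholders config/cluster-vars.yml || echo "edit the lines listed above first"` stops you before a deployment runs with sample credentials.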
### Step 2: Deploy the K3s Cluster
```bash
# Generate the inventory
python3 scripts/generate-inventory.py
# Deploy K3s
cd k3s-ansible
ansible-playbook site.yml -i inventory/hosts.ini
# Verify the cluster
ssh user@master-ip
kubectl get nodes
```
**Expected output**:
```
NAME            STATUS   ROLES                  AGE   VERSION
k3s-master-01   Ready    control-plane,master   5m    v1.28.5+k3s1
k3s-worker-01   Ready    <none>                 4m    v1.28.5+k3s1
```
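Reading the node table by eye works for two nodes but not for a script. The sketch below counts nodes whose STATUS column is anything other than `Ready`; `count_not_ready` is a hypothetical helper that parses the plain-text table, so it assumes the default column layout of `kubectl get nodes`.

```bash
# Read `kubectl get nodes --no-headers` output on stdin and print how many
# nodes have a STATUS other than exactly "Ready". A cordoned node, shown as
# "Ready,SchedulingDisabled", is counted as not ready as well.
# count_not_ready is a hypothetical helper, not part of this project.
count_not_ready() {
  awk '$2 != "Ready" { n++ } END { print n + 0 }'
}
```

Usage: `kubectl get nodes --no-headers | count_not_ready` prints `0` once every node has joined and is healthy. If you only need to block until the nodes are up, `kubectl wait --for=condition=Ready nodes --all --timeout=300s` is a built-in alternative.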
### Step 3: Deploy Gitea
```bash
# Deploy Gitea
./scripts/deploy-gitea.sh
# Initialize Gitea
./scripts/setup-gitea.sh
# Look up the access address
kubectl get svc -n gitea gitea-http
```
**Access**: `http://<MASTER_IP>:<NODEPORT>`
### Step 4: Deploy ArgoCD
```bash
# Deploy ArgoCD
./scripts/deploy-argocd.sh
# Look up the access address
kubectl get svc -n argocd argocd-server
```
**Access**: `https://<MASTER_IP>:<NODEPORT>`
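### Step 5: Configure HTTPS (Optional)
#### 5.1 Prerequisites
- DNS records for your domains resolve to the master node's public IP
- The cloud provider's security group allows inbound ports 80 and 443
#### 5.2 Deploy cert-manager
```bash
# Run on the master node
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.3/cert-manager.yaml
# Wait until it is ready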
kubectl wait --for=condition=ready pod -l app.kubernetes.io/instance=cert-manager -n cert-manager --timeout=300s
```
#### 5.3 Create a ClusterIssuer
```bash
kubectl apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: traefik
EOF
```
#### 5.4 Configure HTTPS for ArgoCD
```bash
# Run argocd-server in insecure mode (plain HTTP) so that TLS terminates at Traefik
kubectl patch configmap argocd-cmd-params-cm -n argocd --type merge -p '{"data":{"server.insecure":"true"}}'
# Restart ArgoCD
kubectl rollout restart deployment argocd-server -n argocd
# Create the HTTPS Ingress
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd-server-tls
  namespace: argocd
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
    traefik.ingress.kubernetes.io/router.tls: "true"
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - argocd.example.com
      secretName: argocd-tls-cert
  rules:
    - host: argocd.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: argocd-server
                port:
                  number: 80
EOF
```
### Step 6: Create a Sample Application
```bash
# Run on the master node
cd /home/fei/k3s
mkdir -p demo-app/manifests
# Create the Deployment
cat > demo-app/manifests/deployment.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: demo-nginx
  template:
    metadata:
      labels:
        app: demo-nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25-alpine
          ports:
            - containerPort: 80
EOF
# Push to Gitea
cd demo-app
git init -b main
git add .
git commit -m "Initial commit"
GITEA_PORT=$(kubectl get svc gitea-http -n gitea -o jsonpath='{.spec.ports[0].nodePort}')
# The password in the URL is percent-encoded: %40 stands for "@"
git remote add origin "http://argocd:ArgoCD%402026@localhost:${GITEA_PORT}/k3s-apps/demo-app.git"
git push -u origin main
# Create the ArgoCD Application
kubectl apply -f - <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: demo-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: http://gitea-http.gitea.svc.cluster.local:3000/k3s-apps/demo-app.git
    targetRevision: main
    path: manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
EOF
```
```
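The Git remote URL in the block above embeds the Gitea password in percent-encoded form. If your own password contains characters such as `@`, `:` or `/`, it needs the same treatment before it goes into the URL. The sketch below is one way to do that in plain POSIX shell; `urlencode` is a hypothetical helper and it only handles single-byte (ASCII) input.

```bash
# Percent-encode an ASCII string so it can be embedded in the user:password
# part of a URL. Unreserved characters are kept, everything else becomes %XX.
# urlencode is a hypothetical helper, not part of this project.
urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s\n' "$out"
)
```

With a password held in a variable (here an assumed `GITEA_PASSWORD`), the remote could then be added as `git remote add origin "http://argocd:$(urlencode "$GITEA_PASSWORD")@localhost:${GITEA_PORT}/k3s-apps/demo-app.git"`.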
---
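## Verifying the Deployment
### Full Verification Checklist
```bash
# 1. Verify the K3s cluster
kubectl get nodes                  # all nodes should be Ready
kubectl get pods -A                # all Pods should be Running
# 2. Verify Gitea
kubectl get pods -n gitea          # the Gitea Pod should be Running
# 3. Verify ArgoCD
kubectl get pods -n argocd         # all ArgoCD Pods should be Running
kubectl get application -n argocd  # applications should be Synced and Healthy
# 4. Verify HTTPS (if configured)
kubectl get certificate -A         # certificates should be Ready
# 5. Verify the GitOps workflow
# Change the app config → commit to Git → wait up to 3 minutes → automatic deployment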
```
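On a cluster with many Pods, scanning `kubectl get pods -A` by eye is error-prone. The sketch below filters that output down to the Pods that need attention. `list_unhealthy_pods` is a hypothetical helper; it treats `Completed` as healthy, on the assumption that one-off Job Pods (such as Helm install jobs) normally end in that state, and it does not catch a Pod that is `Running` but not yet ready.

```bash
# Read `kubectl get pods -A --no-headers` output on stdin and print
# "namespace/name status" for every Pod that is neither Running nor Completed.
# list_unhealthy_pods is a hypothetical helper, not part of this project.
list_unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

Usage: `kubectl get pods -A --no-headers | list_unhealthy_pods`. Empty output means nothing was flagged.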
---
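## Troubleshooting
### Problem 1: A Node Cannot Join the Cluster
```bash
# Check network connectivity
ping <master-private-ip>
# Check the K3s service
sudo systemctl status k3s-agent
sudo journalctl -u k3s-agent -f
```
### Problem 2: Services Are Unreachable from Outside
```bash
# Check the cloud provider's security group
# Open ports: 80, 443, 30000-32767
# Test from inside the master node
curl http://localhost:30080
```
### Problem 3: HTTPS Certificate Issuance Fails
```bash
# Check DNS resolution
nslookup argocd.example.com
# Check the cert-manager logs
kubectl logs -n cert-manager deployment/cert-manager
# Inspect the certificate details
kubectl describe certificate <cert-name> -n <namespace>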
```
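Because the ClusterIssuer uses an HTTP-01 solver, the domain has to resolve to the master node's public IP before a certificate can be issued. The sketch below turns the DNS check into a yes/no test. `resolves_to` is a hypothetical helper and relies on glibc's `getent ahostsv4`, which is an assumption that holds on Ubuntu and Debian but not on macOS. It also only reflects what the local resolver sees, which may differ from public DNS.

```bash
# Succeed when HOST resolves (IPv4) to EXPECTED_IP according to the local resolver.
# Usage: resolves_to HOST EXPECTED_IP
# resolves_to is a hypothetical helper, not part of this project.
resolves_to() {
  getent ahostsv4 "$1" | awk -v ip="$2" '$1 == ip { found = 1 } END { exit !found }'
}
```

Usage: `resolves_to argocd.example.com <master-public-ip> || echo "DNS does not point at the master node yet"`.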
---
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│  K3s Cluster                                                │
│                                                             │
│  Master + Workers                                           │
│         ↓                                                   │
│  Gitea (Git Server)                                         │
│         ↓                                                   │
│  ArgoCD (GitOps)                                            │
│         ↓                                                   │
│  cert-manager (SSL)                                         │
│         ↓                                                   │
│  Traefik (Ingress)                                          │
│         ↓                                                   │
│  Applications                                               │
└─────────────────────────────────────────────────────────────┘
```
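### GitOps Workflow
```
Developer changes the code
        ↓
Commit to Git (Gitea)
        ↓
ArgoCD detects the change (within 3 minutes)
        ↓
Automatic sync to the cluster
        ↓
Application updates automatically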
```
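Since ArgoCD may take up to 3 minutes to notice a commit, a script that pushes a change and then checks the result needs to wait rather than test once. The sketch below polls the Application until it reports `Synced` and `Healthy`. `wait_for_sync` is a hypothetical helper; the default timeout of 300 seconds and the 10-second interval are assumptions chosen to sit comfortably above that 3-minute window.

```bash
# Poll an ArgoCD Application until it is Synced and Healthy, or give up
# after TIMEOUT seconds.
# Usage: wait_for_sync APP [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# wait_for_sync is a hypothetical helper, not part of this project.
wait_for_sync() {
  app=$1
  timeout=${2:-300}
  interval=${3:-10}   # must be greater than 0, or the loop never advances
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    state=$(kubectl get application "$app" -n argocd \
      -o jsonpath='{.status.sync.status} {.status.health.status}' 2>/dev/null)
    if [ "$state" = "Synced Healthy" ]; then
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out waiting for $app (last state: ${state:-unknown})" >&2
  return 1
}
```

Usage: after `git push`, run `wait_for_sync demo-app` and continue only if it succeeds.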
---
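## Summary
Following this guide, you can complete the following in 30-60 minutes:
- ✅ K3s cluster deployment
- ✅ Gitea private Git server
- ✅ ArgoCD GitOps engine
- ✅ Automatic HTTPS certificates
- ✅ Sample application
All configuration steps are idempotent and can be safely re-run.
**Happy deploying!** 🚀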
---
## References
- [README-DEPLOYMENT.md](README-DEPLOYMENT.md) - Detailed deployment documentation
- [USAGE-GUIDE.md](USAGE-GUIDE.md) - Usage guide
- [SUMMARY.md](SUMMARY.md) - Full summary
- [QUICK-REFERENCE.md](QUICK-REFERENCE.md) - Quick reference
- [TROUBLESHOOTING-ACCESS.md](TROUBLESHOOTING-ACCESS.md) - Troubleshooting access issues