From 154fddf95968c407df2792cd1ee04e9defcf9fe5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=EC=9D=B4=EC=86=8C=EC=9D=80?= <144209738+saokiritoni@users.noreply.github.com>
Date: Fri, 2 Jan 2026 00:43:40 +0900
Subject: [PATCH] [docs] Write README
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Added comprehensive documentation for the DGU AI LAB GPU Server Admin Backend project, including project introduction, features, technology stack, setup instructions, CI/CD pipeline, and troubleshooting guidelines.
---
 README.md | 174 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 174 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..2840e6e
--- /dev/null
+++ b/README.md
@@ -0,0 +1,174 @@
+# 🖥️ DGU AI LAB GPU Server Admin Backend
+
+![Java](https://img.shields.io/badge/Java-17-blue?logo=openjdk&logoColor=white)
+![Spring Boot](https://img.shields.io/badge/SpringBoot-3.2-green?logo=springboot&logoColor=white)
+![JPA](https://img.shields.io/badge/JPA-Hibernate-red)
+![Redis](https://img.shields.io/badge/Redis-MessageQueue-red?logo=redis&logoColor=white)
+![MySQL](https://img.shields.io/badge/MySQL-8.0-4479A1?logo=mysql&logoColor=white)
+![Docker](https://img.shields.io/badge/Docker-Container-blue?logo=docker&logoColor=white)
+
+> **Dongguk University GPU Server Room Resource Management and Automation System**
+> A backend server that manages the full lifecycle of a user's server, from the initial request through account creation, expiration notices, and resource reclamation (deletion).
+
+
+
+## 📝 Project Overview
+This project was developed to manage limited GPU server resources (Farm/Lab) efficiently.
+It replaces the previous manual management process and provides **automated Linux account creation/deletion**, **automatic reclamation based on expiration date**, and a **Slack/Email notification system**.
+
+
+
+## 💡 Key Technical Features
+* **Automated Resource Lifecycle:** Implements the request → approval → creation → expiration-imminent notice → automatic deletion (Soft/Hard) process.
+* **Event-Driven Architecture:** Separates DB transactions from the external notification delivery logic to ensure data consistency.
+* **Non-blocking Notification:** Handles notifications asynchronously through a Redis message queue, preventing load spikes during bulk sends.
+* **CQS Pattern:** Separates Command and Query so that UID/GID management stays easy to maintain.
+
+
+## 🌟 Main Features
+
+### 0. User Management
+- Users sign up with their Dongguk University email.
+- User/administrator permission management based on Spring Security & JWT.
+- Supports automatic account withdrawal and notification features for administrators.
+
+### 1. Resource Request and Approval
+- Users submit a request by selecting the desired GPU capacity, usage period, and image.
+- On administrator approval, a **UsedId(UID/GID)** is assigned automatically and the **Ubuntu account creation API** is called.
+
+### 2. Automated Scheduler (runs daily at 10:00)
+- **Expiration notice:** Sends a notification to the user on fixed days before expiration (7, 3, and 1 days before).
+- **Automatic reclamation:** When the expiration date arrives, deletes the Linux account, cleans up DB data (Cascade), and releases the UID.
+
+### 3. Notification System (Slack & Email)
+- **Users:** Request results, expiration notices, and deletion-complete notices.
+- **Administrators:** Server errors and resource deletion reports (distinguished by Lab/Farm tag).
+
+
+
+## 🛠 Tech Stack
+
+| Category | Technology | Notes |
+| :--- | :--- | :--- |
+| **Language** | Java 17 | |
+| **Framework** | Spring Boot 3.2 | Web, Security |
+| **Database** | MySQL 8.0 | Production DB |
+| **ORM** | Spring Data JPA | Hibernate 6.x |
+| **Message Queue** | Redis | Asynchronous notification processing |
+| **Infrastructure** | Docker, Linux | Ubuntu account integration |
+| **Build Tool** | Gradle | |
+
+
+
+## 🚀 How to Run
+
+### 1. Prerequisites
+* Java 17+
+* Redis
+* MySQL
+
+### 2. Running Locally
+```bash
+# 1. Clone the repository
+git clone https://github.com/DGU-AI-LAB/admin-be.git
+cd admin-be
+
+# 2. Start Redis & DB (when using Docker; only the Redis command is shown here)
+docker run -d -p 6379:6379 --name redis redis
+
+# 3. Build and run the application
+./gradlew clean build
+java -jar build/libs/admin-be-0.0.1-SNAPSHOT.jar
+```
+
+### 3. ⚙️ Environment Variable Configuration
+The application only runs correctly after the settings documented in Notion are entered in `src/main/resources/application.yml`.
+
+## 1. Branch Strategy
+We follow a Git Flow-based strategy, and deployment to the actual server happens only when code is merged into the `main` branch.
+| Branch | Role | Deployed | Notes |
+| :--- | :--- | :---: | :--- |
+| **`main`** | **Production environment** | **O (automatic)** | Deployed immediately after the PR is merged |
+| **`develop`** | **Development integration** | X | Integration testing after feature development |
+| `feature/*` | Individual feature development | X | Branched from `develop` |
+| `hotfix/*` | Urgent fixes for production issues | O | Branched from `main`; deployed immediately after merge (use not recommended) |
+
+---
+
+## 2. CI/CD Pipeline (Deployment Pipeline)
+
+Deployment is automated with **GitHub Actions** and runs only when a `push` event occurs on the `main` branch.
+
+### 🔄 Deployment Flow (Workflow)
+1. **Trigger**: The workflow starts when a PR from `develop` → `main` is merged.
+2. **Build & Push**:
+    * Builds a Docker image from the source code.
+    * Two image tags are created: `latest` and `Git Commit Hash`.
+    * The image is pushed to the team/organization repository on Docker Hub.
+3. **Deploy (Helm Upgrade)**:
+    * GitHub Actions connects to the production server (`farm8`) over SSH.
+    * Runs the Kubernetes deployment with the `helm upgrade` command.
+    * **Key Config**: The `--set image.pullPolicy=Always` option forces the latest image to be pulled on every deployment.
+
+---
+
+## 3. Work and Deployment Rules (Workflow Rules)
+
+Please follow the procedure below to prevent conflicts between team members and keep deployments stable.
+
+### 🛠 Feature Development (Feature)
+1. Create a `feature/#issue-number-feature-name` branch from `develop`, matching the number of the GitHub issue you opened. (e.g. feat/#155-scheduler)
+2. Develop and test locally.
+3. Commit message format: `[type] #issue description` (e.g. `[feat] #4 Build the main feature`)
+4. When the work is done, open a Pull Request (PR) from `feature` → `develop`.
+
+### 🚀 Regular Deployment (Release)
+1. Prepare a deployment once enough features have accumulated on `develop` and testing is complete.
+2. Open a PR from **`develop` → `main`**. PR title: `[deploy] develop -> main (or an additional description)`.
+3. After code review (Approve), pressing the Merge button **deploys to the production server immediately.** At least one Approve is required.
+
+---
+
+## 4. API Documentation and Monitoring
+
+While the server is running normally, the API specification (Swagger) and the health check are available at the addresses below.
+
+* **Swagger UI**: `http://{farm_server_ip}:{port}/apidocs/`
+* **Health Check**: `http://{farm_server_ip}:{port}/health`
+
+> **Note**: The NodePort follows the `values.yaml` configuration.
+
+---
+
+## 5. Troubleshooting
+
+How to check and respond when a problem occurs after deployment.
+
+### 1. Pod Status Check
+```bash
+kubectl get pods -n cssh
+```
+- Healthy: Running (READY 1/1)
+- Error: CrashLoopBackOff, ImagePullBackOff, Pending
+
+### 2. Log Check
+
+Check the live logs when the server fails to start or behaves abnormally.
+```bash
+# After checking the Pod name, replace <pod-name> with the name shown by `kubectl get pods`
+kubectl logs -f <pod-name> -n cssh
+```
+
+## 6. Environment Variables and Secrets
+The following variables are registered in GitHub Repository Secrets so that CI/CD can run.
+- Docker Hub: `DOCKER_USERNAME`, `DOCKER_PASSWORD`
+- Kubernetes Access: `K8S_HOST`, `K8S_USERNAME`, `K8S_PRIVATE_KEY`, `K8S_PORT`
+> Currently these are set to the username toni and {key}; a handover is required when the administrator changes.
+
+## 📚 Documentation and Wiki
+For more detailed development guides and troubleshooting logs, see the **GitHub Wiki**.
+
+> Related links
+> - [Infrastructure server](https://github.com/CSID-DGU/admin_infra)
+> - [Frontend](https://github.com/CSID-DGU/AILab-FE)
+