[AWS_3] AWS RDS

2019-3-6 Wed 17:18

AWS

AWS RDS

Overview

A web service that makes it easier to set up, operate, and scale a relational database
Engines: MySQL, MariaDB, PostgreSQL, Aurora, Oracle, MS SQL Server
Aurora
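- Traditional RDS: instance-centric; each instance carries its own storage
- Aurora: cluster-centric; compute is separated from a shared storage layer, with backups stored in S3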
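- Storage grows automatically up to 64 TB
  - MySQL: grows up to 20 GB; once that is full, the application stops
- The writer and readers physically reference the same disk (they are only logically separate)
  - vs. MySQL: each instance has its own disk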
Creation: Services > RDS
MySQL connection (MySQL Workbench on Linux; no JDBC driver needed)
- Connection Name: the writer instance ID
- Hostname: the endpoint
- Port: 3306
- Username: Mastt # master username
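The same connection fields can be carried into a script. The following is a minimal sketch, not a definitive setup: the endpoint, connection name, and password are hypothetical placeholders, and the PyMySQL driver mentioned in the trailing comment is an assumption (any DB-API MySQL driver that takes these keyword arguments would do).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MySQLConnectionSettings:
    """The fields entered in the connection dialog."""
    name: str         # Connection Name: the writer instance ID
    host: str         # Hostname: the endpoint shown in the RDS console
    user: str         # Username: the master username
    port: int = 3306  # Port: MySQL default

    def as_connect_kwargs(self, password: str) -> dict:
        """Keyword arguments in the shape common MySQL drivers accept."""
        return {
            "host": self.host,
            "port": self.port,
            "user": self.user,
            "password": password,
        }


# Hypothetical values; substitute the ones shown in the RDS console.
settings = MySQLConnectionSettings(
    name="my-aurora-writer",
    host="my-aurora-writer.example.us-east-1.rds.amazonaws.com",
    user="Mastt",
)
kwargs = settings.as_connect_kwargs(password="change-me")
print(kwargs["host"], kwargs["port"])

# With a driver installed (assumption: PyMySQL), the connection would be opened as:
#   import pymysql
#   conn = pymysql.connect(**kwargs)
```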
Backup/Recovery
- Backup: Actions > Take snapshot
- Recovery: restore from Snapshots
Scale-up: Modify (change the instance class)
Scale-out: add a reader
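The console actions above have API counterparts. The sketch below wraps three boto3 RDS client calls (`create_db_cluster_snapshot`, `modify_db_instance`, `create_db_instance`) in small helpers. All identifiers and the instance class are hypothetical, and the `aurora-mysql` engine name assumes a MySQL-compatible Aurora cluster. The client is passed in as a parameter so the helpers can be read without AWS credentials.

```python
def take_snapshot(rds, cluster_id: str, snapshot_id: str):
    """Backup: API counterpart of Actions > Take snapshot for an Aurora cluster."""
    return rds.create_db_cluster_snapshot(
        DBClusterIdentifier=cluster_id,
        DBClusterSnapshotIdentifier=snapshot_id,
    )


def scale_up(rds, instance_id: str, instance_class: str):
    """Scale-up: change the instance class of an existing instance (Modify)."""
    return rds.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        DBInstanceClass=instance_class,
        ApplyImmediately=True,
    )


def add_reader(rds, cluster_id: str, reader_id: str, instance_class: str):
    """Scale-out: add another instance to the cluster; it joins as a reader."""
    return rds.create_db_instance(
        DBInstanceIdentifier=reader_id,
        DBClusterIdentifier=cluster_id,
        DBInstanceClass=instance_class,
        Engine="aurora-mysql",  # assumption: MySQL-compatible Aurora
    )


# Usage (requires boto3 and AWS credentials; names are hypothetical):
#   import boto3
#   rds = boto3.client("rds")
#   take_snapshot(rds, "my-cluster", "my-cluster-snapshot-1")
#   scale_up(rds, "my-cluster-writer", "db.r5.large")
#   add_reader(rds, "my-cluster", "my-cluster-reader-2", "db.r5.large")
```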
Cache service: stores frequently accessed data on a cache server
- Example: a game data analytics architecture
Redshift: relational data warehouse

Aurora DB

Aurora brings a novel architecture to the relational database to address this constraint, most notably by pushing redo processing to a multi-tenant scaleout storage service, purpose-built for Aurora. We describe how doing so not only reduces network traffic, but also allows for fast crash recovery, failovers to replicas without loss of data, and fault-tolerant, self-healing storage. (“Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases”)

Creating an Aurora database creates a DB cluster
- A DB cluster consists of one or more DB instances and a cluster volume that manages their data
- The Aurora cluster volume is a virtual database storage volume that spans multiple AZs; each AZ holds a copy of the cluster data
- Primary DB instance
  - Handles reads and writes and performs all data modifications to the cluster volume
- Aurora Replica
  - Read-only; connects to the same storage volume as the primary DB instance
  - Up to 15 Aurora Replicas
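Feature: multi-master
- Multiple read/write master instances can be created across multiple AZs
- If one master instance fails (or an entire AZ fails), another instance in the cluster takes over immediately, so service is not interrupted
  - In other words, when a master DB server fails, another server in the cluster replaces it right away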
MySQL vs Aurora
- MySQL: each reader has its own disk
- Aurora: the writer and readers share a single disk (the cluster volume)

Terminology

ETL: Extract - Transform - Load
- e.g. extract: select the Y/M/D/t/m columns with SQL -> transform: combine them into Y-M-D(tm) -> load: insert the result
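The ETL example above can be sketched with Python's built-in `sqlite3` module. This is an illustrative sketch only: the table names (`raw_events`, `events`), the column names, and the reading of t/m as hour/minute are assumptions, not taken from the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_events (y INT, m INT, d INT, hh INT, mi INT, value REAL)"
)
conn.execute("CREATE TABLE events (ts TEXT, value REAL)")
conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?, ?, ?, ?, ?)",
    [(2019, 3, 6, 17, 18, 1.5), (2019, 3, 7, 9, 5, 2.0)],
)


def extract(conn):
    """Extract: read the separate date/time columns with a SELECT."""
    return conn.execute(
        "SELECT y, m, d, hh, mi, value FROM raw_events ORDER BY y, m, d, hh, mi"
    ).fetchall()


def transform(rows):
    """Transform: merge the date/time parts into one 'Y-M-D hh:mm' string."""
    return [
        (f"{y:04d}-{m:02d}-{d:02d} {hh:02d}:{mi:02d}", value)
        for y, m, d, hh, mi, value in rows
    ]


def load(conn, rows):
    """Load: insert the transformed rows into the target table."""
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    conn.commit()


load(conn, transform(extract(conn)))
loaded = conn.execute("SELECT ts, value FROM events ORDER BY ts").fetchall()
print(loaded)
```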
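Transaction: a unit action that affects the contents of the data, such as a trade, a stock movement, or a save
OLTP (Online Transaction Processing)
- Operations-oriented (processes and analyzes raw data); widely used in financial computing
- A multi-step operation runs as a single unit of work (a transaction)
  - Handles, as unit tasks, real-time updates and lookups of DB data by many users over a network
- So that many users can work almost simultaneously, transmitted data is packed per transaction and the freed capacity is allocated to other users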
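To make the "several steps executed as one unit" idea concrete, here is a small sketch using `sqlite3` from the standard library. The `accounts` table and the `transfer` helper are hypothetical; the point is only that both updates are committed together or rolled back together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit to dst
        # made in the same transaction has been rolled back as well.
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # violates the CHECK, rolled back
print(ok, failed, balances(conn))
```

The second transfer credits `bob` first and only then fails on `alice`, so the unchanged final balance of `bob` shows that the partial work was undone.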
OLAP (Online Analytical Processing)
- Analysis-oriented
- Fast processing of large data volumes
- Users access the data directly
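Cluster: a method of storing the rows of tables that are frequently joined or accessed together in the same location on disk, to reduce the time spent reading from disk
- Aurora: see the DB cluster described in the Aurora DB section
- Advantage: rows that share the grouped column values are stored in the same data block, which reduces disk I/O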
Replication
- MySQL Replication
- Concept: one way to scale out (add servers)
- When to use it: a high number of reads and a low number of writes/updates
- How to use it: distribute the reads over the replication slaves, while still enabling your web servers to communicate with the replication master when a write is required
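The read-distribution idea can be sketched as a tiny router that sends writes to the master and spreads reads across replicas in round-robin order. `ReadWriteRouter` and the endpoint names are hypothetical, and the sketch deliberately ignores replication lag and statements such as `SELECT ... FOR UPDATE`, which a real setup would have to handle.

```python
import itertools


class ReadWriteRouter:
    """Route statements: SELECTs go to replicas, everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master for reads if no replica is configured.
        self._readers = itertools.cycle(list(replicas) or [master])

    def route(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._readers)  # round-robin over the replicas
        return self.master


router = ReadWriteRouter("master-endpoint", ["replica-1", "replica-2"])
targets = [
    router.route("SELECT * FROM users"),
    router.route("select * FROM orders"),
    router.route("INSERT INTO users VALUES (1)"),
    router.route("SELECT 1"),
]
print(targets)
```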
Master/Slave $\approx$ Manager/Worker: in an Aurora cluster, when the workers are running short, add a reader, then another, and another…
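Hierarchical query (hierarchical vs relational)
- The CONNECT BY syntax comes from Oracle; standard SQL expresses the same thing with recursive CTEs (WITH RECURSIVE), which PostgreSQL and MySQL 8.0 support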
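Outside Oracle's CONNECT BY, a hierarchy can be walked with a standard recursive CTE. The sketch below runs one in SQLite through Python's `sqlite3` module; the `employees` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "CEO", None), (2, "VP", 1), (3, "Engineer", 2), (4, "Analyst", 2)],
)

# Start from the root (no manager), then repeatedly join children to the
# rows found so far, tracking the depth of each level.
rows = conn.execute(
    """
    WITH RECURSIVE org(id, name, depth) AS (
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, org.depth + 1
        FROM employees AS e JOIN org ON e.manager_id = org.id
    )
    SELECT name, depth FROM org ORDER BY depth, id
    """
).fetchall()
print(rows)
```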

Miscellaneous

Dark data: data that is no longer used -> needs to be cleaned up
Meta system: defines the table creation procedure and how unused data is handled
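Query
- Fix the base population and the criteria first, then keep attaching other tables on the right
- Always check with count, so that a population of 1,000 does not silently turn into 2,000 or some other number after a join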
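The count check can be illustrated with a small `sqlite3` sketch. The `users` and `logins` tables are hypothetical; the point is that a one-to-many join inflates the row count even though the number of distinct users stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE logins (user_id INTEGER, device TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO logins VALUES (?, ?)", [(1, "pc"), (1, "phone"), (2, "pc")]
)


def scalar(sql):
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]


# Base population: one row per user.
base = scalar("SELECT COUNT(*) FROM users")

# After the join, user 1 appears twice because of two login rows.
joined_rows = scalar(
    "SELECT COUNT(*) FROM users u LEFT JOIN logins l ON l.user_id = u.user_id"
)

# Counting distinct keys confirms the population itself did not change.
joined_users = scalar(
    "SELECT COUNT(DISTINCT u.user_id) "
    "FROM users u LEFT JOIN logins l ON l.user_id = u.user_id"
)
print(base, joined_rows, joined_users)
```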
Join types: inner join / left outer join / left anti join
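The three join types can be compared side by side on a toy data set. The tables below are made up. Note that plain SQL has no `LEFT ANTI JOIN` keyword (the term comes from engines such as Spark SQL), so this sketch expresses it as a left join that keeps only the unmatched rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Ben"), (3, "Cho")]
)
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "book"), (3, "pen")])

# Inner join: only customers that have a matching order.
inner = conn.execute(
    "SELECT c.name, o.item FROM customers c "
    "JOIN orders o ON o.customer_id = c.id ORDER BY c.id"
).fetchall()

# Left outer join: every customer, with NULL where no order matches.
left_outer = conn.execute(
    "SELECT c.name, o.item FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id ORDER BY c.id"
).fetchall()

# Left anti join: customers with no matching order at all.
left_anti = conn.execute(
    "SELECT c.name FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id "
    "WHERE o.customer_id IS NULL ORDER BY c.id"
).fetchall()

print(inner, left_outer, left_anti)
```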
A replica server should have specs equal to or higher than the master (to avoid overload)

References

Henry's blog
