## VOSpace backend
### Introduction
This repository hosts the code of the VOSpace backend.
This is a dockerized version that includes all the VOSpace components and can be run directly on your laptop.
The VOSpace implementation is composed of several parts, each hosted in one of the [following repositories](https://www.ict.inaf.it/gitlab/vospace).
For a production-like demo, please refer to the [vospace-demo](https://www.ict.inaf.it/gitlab/vospace/vospace-demo) repository or simply visit [this page](http://staging.ia2.inaf.it/) and try it out.
For more information about the VOSpace specification, please refer to:
- [IVOA Documents & Standards](https://www.ivoa.net/documents/)
- [VOSpace standard v2.1](https://www.ivoa.net/documents/VOSpace/20180620/REC-VOSpace-2.1.html)
- [Universal Worker Service Pattern v1.1](https://www.ivoa.net/documents/UWS/20161024/REC-UWS-1.1-20161024.html)
Further documentation on the VOSpace implementation can be found [here](https://redmine.ict.inaf.it/projects/401/wiki).
### Main features
- Recursive scan, checksum calculation and .tar generation for data provided by users
- Data transfer operations from and to hot and cold storage points
- Database interaction for storing and retrieving information about VOSpace nodes, jobs, storage points and users
- Simple FCFS (First Come First Served) scheduling mechanism for jobs based on Redis lists
- A set of command line tools that simplify the administrator's interaction with the backend architecture
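To illustrate the FCFS scheduling listed above, here is a minimal sketch that mimics Redis list semantics with an in-memory `deque`. It is not the project's code: the helper names (`rpush`, `lpop`, `promote_next`), the job identifiers and the choice of which end of the list is pushed to and popped from are assumptions made for illustration. Only the queue names come from this README.

```python
from collections import deque

# In-memory stand-in for Redis lists (assumption: the real backend keeps
# these lists in Redis). Appending at the tail and removing from the head
# yields First Come First Served order.
queues = {"write_pending": deque(), "write_ready": deque()}


def rpush(queue_name, job):
    """Append a job to the tail of a queue (like Redis RPUSH)."""
    queues[queue_name].append(job)


def lpop(queue_name):
    """Remove and return the job at the head of a queue, or None if empty (like Redis LPOP)."""
    queue = queues[queue_name]
    return queue.popleft() if queue else None


def promote_next(source="write_pending", target="write_ready"):
    """Move the oldest job from `source` to `target`; return it, or None if `source` is empty."""
    job = lpop(source)
    if job is not None:
        rpush(target, job)
    return job


# Hypothetical job identifiers, enqueued in arrival order.
for job_id in ("job-1", "job-2", "job-3"):
    rpush("write_pending", job_id)

first = promote_next()   # oldest pending job moves to write_ready
second = promote_next()  # then the next oldest
```

Because jobs are always taken from the head and added at the tail, the order in which they reach *write_ready* matches the order in which they arrived.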
### Getting started
First, clone the repository on your local Linux machine, open a terminal and move into the *vospace-transfer-service* folder.
You can launch the whole environment by running the following commands (as a regular user):
```
docker-compose pull
docker-compose up
```
The web interface will be available in your browser at http://localhost:8080/ once all the containers are up and running.
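If you prefer to wait for the web interface from a script rather than refreshing the browser, a small polling helper is one option. The sketch below uses only the Python standard library; the function name `wait_for_http` and the default timeout and interval are assumptions, not part of this repository.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout=120.0, interval=2.0):
    """Poll `url` until it answers with any HTTP status or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return True
        except (urllib.error.URLError, OSError):
            # Not reachable yet: wait and retry.
            time.sleep(interval)
    return False
```

For example, `wait_for_http("http://localhost:8080/")` returns `True` as soon as the server responds and `False` if it does not respond within the timeout.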
To stop the environment and perform a cleanup, launch the following commands from another shell:
```
docker-compose down
docker system prune -a
docker volume prune
```
### Components
- Client (container_name: client) provides the user with command line tools to interact with the backend
- Transfer service (container_name: transfer_service) is the core of the backend architecture
- Redis (container_name: cache) is used as a cache for job queues, as a message broker that delivers requests from the user command line tools and from the VOSpace REST APIs, and for basic centralized logging
- File catalog (container_name: file_catalog), now available [here](https://www.ict.inaf.it/gitlab/vospace/vospace-file-catalog), is a PostgreSQL database that stores information on VOSpace nodes, storage locations, jobs and users.
#### Client
You can access the *client* container with:
```
docker exec -it client bash
```
The *client* container provides the following command line tools:
- **vos_data**: launches a job to automatically store data provided by the user on a given storage point (hot or cold)
- **vos_group**: adds, lists or removes users/groups in the 'group_read' and/or 'group_write' of a VOSpace node and its child nodes (if any)
- **vos_user**: adds users to or removes them from the database
- **vos_import**: imports VOSpace nodes into the file catalog for data already stored on a given storage point
- **vos_job**: provides information about jobs
- **vos_storage**: adds, lists or removes storage points.
You can launch each of these commands without any argument to see its help page.
#### Transfer service
The transfer service is the core of the VOSpace backend architecture.
You can access the *transfer_service* container with:
```
docker exec -it transfer_service bash
```
On this container, hosted on the so-called transfer node, each user has a home folder with two subfolders that serve as the entry point and the exit point, respectively, for the user's data:
- */home/name.surname/vospace_store*
- */home/name.surname/vospace_retrieve*
The user copies the data to be stored into the *vospace_store* folder and finds the requested data in the *vospace_retrieve* folder.
This use case was implemented to support users who provide large amounts of data, on the order of terabytes.
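The recursive scan and checksum calculation mentioned among the main features can be pictured with the sketch below. It is an illustration only, not the transfer service's code: MD5 is an assumption suggested by the `content_md5` field of the **node** table, and the function names and chunk size are hypothetical. Files are read in chunks so that memory use stays bounded even for very large files.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_checksums(root):
    """Walk `root` recursively and return {relative_path: md5_hex} for every file found."""
    checksums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            full_path = os.path.join(dirpath, name)
            checksums[os.path.relpath(full_path, root)] = md5_of_file(full_path)
    return checksums
```

A call such as `scan_checksums("/home/name.surname/vospace_store")` would return one checksum per file found under the store folder.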
Log files can be found in */var/log/vos_ts*.
Report files for storage and import operations can be found in */var/log/vos_ts/results*.
#### Redis
You can access the Redis server from the **client** container by following the steps below.
1. Execute an interactive bash shell on the client container:
```
docker exec -it client bash
```
2. Use the *redis-cli* command to connect to Redis:
```
redis-cli -h cache
```
3. You can obtain information about jobs by inspecting the following queues:
- for write operations the queues are *write_pending*, *write_ready* and *write_terminated*
- for read operations the queues are *read_pending*, *read_ready* and *read_terminated*
- for import operations the queues are *import_ready* and *import_terminated*
- for group operations the queues are *group_rw_ready* and *group_rw_terminated*
4. Some of the messages sent by the VOSpace REST APIs are handled by the following queues:
- for starting async recalls the queue is *start_job_queue*
5. The messages sent by the user command line tools are handled by the following queues:
- for vos_data the queue is *data_queue*
- for vos_import the queue is *import_queue*
- for vos_job the queue is *job_queue*
Example: list the first six elements of the *write_ready* job queue
```
cache:6379> lrange write_ready 0 5
```
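To check all the job queues at once instead of typing one command per queue, a small wrapper around *redis-cli* is one option. The sketch below assumes that Python and *redis-cli* are both available where it runs and that Redis is reachable as `cache`. The helper names are hypothetical, while the queue names are those listed above. `LLEN` is the standard Redis command that returns the length of a list.

```python
import subprocess

REDIS_HOST = "cache"  # hostname of the Redis container, as used with redis-cli above

# Job queue names listed in this README.
JOB_QUEUES = [
    "write_pending", "write_ready", "write_terminated",
    "read_pending", "read_ready", "read_terminated",
    "import_ready", "import_terminated",
    "group_rw_ready", "group_rw_terminated",
]


def llen_command(queue, host=REDIS_HOST):
    """Build the redis-cli argument list that asks for the length of one list."""
    return ["redis-cli", "-h", host, "LLEN", queue]


def queue_lengths(queues=JOB_QUEUES, host=REDIS_HOST):
    """Run LLEN for each queue and return {queue_name: redis-cli output as a string}."""
    lengths = {}
    for queue in queues:
        result = subprocess.run(
            llen_command(queue, host), capture_output=True, text=True, check=True
        )
        lengths[queue] = result.stdout.strip()
    return lengths
```

Calling `queue_lengths()` returns a dictionary that maps each queue name to the length reported by Redis.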
By default, [AOF + RDB persistence](https://redis.io/topics/persistence) is enabled.
You can see the content of the AOF file with:
```
cat /data/appendonly.aof
```
#### File catalog
You can access the file catalog from the **client** container by following the steps below.
1. Execute an interactive bash shell on the client container:
```
docker exec -it client bash
```
2. Access the database via the *psql* client:
```
psql -h file_catalog -d vospace_testdb -U postgres
```
3. You can now run a query, for example one that shows selected fields for all the rows of the **node** table:
```
vospace_testdb=# SELECT node_id, path, name, parent_path, type, creator_id, content_md5, async_trans, sticky FROM node;
```
You can also query the other tables: *deleted_node*, *storage*, *location*, *job* and *users*.
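For a quick look at any of these tables without opening an interactive session, the query can be passed to *psql* with its `-c` option. The sketch below builds such a command. The helper names and the default row limit are hypothetical, and the code assumes that Python and *psql* are available where it runs and that the connection parameters are the ones shown above. Depending on the database configuration, *psql* may ask for a password.

```python
import subprocess

# Connection parameters taken from the psql command shown above.
PSQL_BASE = ["psql", "-h", "file_catalog", "-d", "vospace_testdb", "-U", "postgres"]

# Table names listed in this README.
TABLES = ["node", "deleted_node", "storage", "location", "job", "users"]


def preview_command(table, limit=5):
    """Build the psql argument list that prints the first `limit` rows of a known table."""
    if table not in TABLES:
        raise ValueError(f"unknown table: {table}")
    return PSQL_BASE + ["-c", f"SELECT * FROM {table} LIMIT {int(limit)};"]


def preview_table(table, limit=5):
    """Run the preview query through psql and return its text output."""
    result = subprocess.run(
        preview_command(table, limit), capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, `print(preview_table("job"))` shows the first five rows of the *job* table. Restricting the table name to the known list keeps arbitrary text out of the SQL statement.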