## VOSpace backend

### Introduction

This repository hosts the code of the VOSpace backend.
This is a dockerized version that includes all the VOSpace backend components and can be run directly on your laptop.

The VOSpace implementation is composed of several parts, each hosted in one of the [following repositories](https://www.ict.inaf.it/gitlab/vospace).

For a production-like demo, please refer to the [vospace-demo](https://www.ict.inaf.it/gitlab/vospace/vospace-demo) repository or simply visit [this page](http://staging.ia2.inaf.it/) and try it out.

For more information about the VOSpace specification, please refer to:
- [IVOA Documents & Standards](https://www.ivoa.net/documents/)
- [VOSpace standard v2.1](https://www.ivoa.net/documents/VOSpace/20180620/REC-VOSpace-2.1.html)
- [Universal Worker Service Pattern v1.1](https://www.ivoa.net/documents/UWS/20161024/REC-UWS-1.1-20161024.html)

Further documentation on the VOSpace implementation can be found [here](https://redmine.ict.inaf.it/projects/401/wiki).

### Main features

- Recursive scan, checksum calculation and .tar generation for data provided by users
- Data transfer operations from and to hot and cold storage points
- Database interaction for storing and retrieving information about VOSpace nodes, jobs, storage points and users
- Simple FCFS (First Come First Served) scheduling mechanism for jobs based on Redis lists
- Set of command line tools to simplify the administrator's interaction with the backend architecture
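
The first feature in the list above (recursive scan, checksum calculation and .tar generation) can be illustrated with a minimal Python sketch. This is not the backend's actual code: the function names `md5_of_file`, `scan_and_checksum` and `make_tar` are hypothetical, and the choice of MD5 is an assumption based only on the `content_md5` field of the file catalog.

```python
import hashlib
import os
import tarfile
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_and_checksum(root):
    """Walk `root` recursively and map each relative file path to its MD5."""
    checksums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            relative_path = os.path.relpath(full_path, root)
            checksums[relative_path] = md5_of_file(full_path)
    return checksums


def make_tar(root, tar_path):
    """Pack the directory tree under `root` into an uncompressed .tar file."""
    top_level = os.path.basename(os.path.normpath(root))
    with tarfile.open(tar_path, "w") as archive:
        archive.add(root, arcname=top_level)
    return tar_path


if __name__ == "__main__":
    # Build a throwaway directory tree, then scan and archive it.
    with tempfile.TemporaryDirectory() as workdir:
        root = os.path.join(workdir, "dataset")
        os.makedirs(os.path.join(root, "night1"))
        with open(os.path.join(root, "night1", "frame.txt"), "wb") as handle:
            handle.write(b"hello")
        print(scan_and_checksum(root))
        print(make_tar(root, os.path.join(workdir, "dataset.tar")))
```

Reading files in fixed-size chunks keeps memory usage constant, which matters when the data provided by users is very large.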


### Getting started

First, clone the repository on your local Linux machine, open a terminal and move into the *vospace-transfer-service* folder.
You can launch the whole environment by running the following commands (as a regular user):

```
docker-compose pull
docker-compose up
```
The web interface will be available in your browser at http://localhost:8080/ once all the containers are up and running.

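To stop the environment and perform a cleanup, launch the following commands from another shell. Note that the two `prune` commands remove unused Docker data (stopped containers, unused images, networks and volumes) for the whole machine, not only for this project: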

```
docker-compose down
docker system prune -a
docker volume prune
```

### Components

- Client (container_name: client) provides the user with some useful command line tools to interact with the backend
- Transfer service (container_name: transfer_service) is the core of the backend architecture
- Redis (container_name: cache) is used as a cache for job queues, as a message broker to deliver messages containing requests from the user command line tools and from the VOSpace REST APIs, and also for basic centralized logging
- File catalog (container_name: file_catalog), now available [here](https://www.ict.inaf.it/gitlab/vospace/vospace-file-catalog), is a PostgreSQL database used to store information on VOSpace nodes, as well as on storage locations, jobs and users.


#### Client

You can access the *client* container with:
```
docker exec -it client bash
```
The *client* container provides the following command line tools:

- **vos_data**: launches a job to automatically store data provided by the user on a given storage point (hot or cold)
- **vos_group**: adds, lists or removes users/groups in the 'group_read' and/or 'group_write' of a VOSpace node and its child nodes (if any)
- **vos_user**: adds users to or removes users from the database
- **vos_import**: imports VOSpace nodes on the file catalog for data already stored on a given storage point
- **vos_job**: provides information about jobs
- **vos_storage**: adds, lists or removes storage points.

You can launch each of these commands without any argument to see their help page.

#### Transfer service

The transfer service is the core of the VOSpace backend architecture.

You can access the *transfer_service* container with:
```
docker exec -it transfer_service bash
```
On this container, hosted on the so-called transfer node, each user has a home folder with two subfolders that act as the entry point and the exit point for the user's data, respectively:

- */home/name.surname/vospace_store*
- */home/name.surname/vospace_retrieve*

Users copy the data to be stored into the *vospace_store* folder and find the requested data in the *vospace_retrieve* folder.
This use case was implemented to support users providing huge amounts of data, in the order of terabytes.

Log files can be found in */var/log/vos_ts*.

Report files for storage and import operations can be found in */var/log/vos_ts/results*.

#### Redis

You can access the Redis server from the **client** container by following the steps below.

1. Execute an interactive bash shell on the client container:
```
docker exec -it client bash
```
2. Use *redis-cli* command to connect to Redis:
```
```
3. You can obtain information about the jobs by looking them up in the following queues:
   - for write operations the queues are *write_pending*, *write_ready* and *write_terminated*
   - for read operations the queues are *read_pending*, *read_ready* and *read_terminated*
   - for import operations the queues are *import_ready* and *import_terminated*
   - for group operations the queues are *group_rw_ready* and *group_rw_terminated*
4. Some of the messages sent by the VOSpace REST APIs are handled by the following queues:
   - for starting async recalls the queue is *start_job_queue*
5. The messages sent by the user command line tools are handled by the following queues:
   - for vos_data the queue is *data_queue*
   - for vos_group the queue is *group_queue*
   - for vos_user the queue is *user_queue*
   - for vos_import the queue is *import_queue*
   - for vos_job the queue is *job_queue*
   - for vos_storage the queue is *storage_queue*
Example: list the first six elements of the *write_ready* job queue:
```
redis:6379> lrange write_ready 0 5
```
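The same kind of inspection can be sketched in Python. The block below is only an illustration under stated assumptions: `InMemoryLists` is a hypothetical in-memory stand-in that mimics the `rpush`, `lpop` and `lrange` list commands, so the example runs without a Redis server, and the FCFS convention used here (enqueue at the tail, dequeue from the head) is an assumption, not necessarily the one adopted by the transfer service. The helper names `peek_jobs` and `move_next_job` are hypothetical as well. A real client such as redis-py exposes methods with the same names, so it could be passed in place of the stand-in.
```python
class InMemoryLists:
    """Hypothetical stand-in for a Redis client, limited to list commands."""

    def __init__(self):
        self._lists = {}

    def rpush(self, name, *values):
        """Append values at the tail of the list; return the new length."""
        items = self._lists.setdefault(name, [])
        items.extend(values)
        return len(items)

    def lpop(self, name):
        """Remove and return the head of the list, or None if it is empty."""
        items = self._lists.get(name)
        return items.pop(0) if items else None

    def lrange(self, name, start, end):
        """Return items from start to end, both inclusive, as LRANGE does."""
        items = self._lists.get(name, [])
        size = len(items)
        if start < 0:
            start = max(size + start, 0)
        if end < 0:
            end = size + end
        return items[start:end + 1]


def peek_jobs(client, queue, count=6):
    """List the first `count` jobs of a queue without removing them."""
    return client.lrange(queue, 0, count - 1)


def move_next_job(client, source, destination):
    """Serve the oldest job of `source` (FCFS) and append it to `destination`."""
    job = client.lpop(source)
    if job is not None:
        client.rpush(destination, job)
    return job


if __name__ == "__main__":
    client = InMemoryLists()
    for number in range(8):
        client.rpush("write_ready", f"job-{number}")
    print(peek_jobs(client, "write_ready"))
    print(move_next_job(client, "write_ready", "write_terminated"))
```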
By default, [AOF + RDB persistence](https://redis.io/topics/persistence) is enabled.
You can see the content of the AOF file with:
```
cat /data/appendonly.aof
```

#### File catalog

You can access the file catalog from the **client** container by following the steps below.

1. Execute an interactive bash shell on the client container:
```
docker exec -it client bash
```
2. Access the database via the *psql* client:
```
psql -h file_catalog -d vospace_testdb -U postgres
```
3. You can now run a query, for example to show selected fields of all the tuples in the **node** table:
```
vospace_testdb=# SELECT node_id, path, name, parent_path, type, creator_id, content_md5, async_trans, sticky FROM node;
```

You can also query the other tables: *deleted_node*, *storage*, *location*, *job* and *users*.
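
Queries like the one in step 3 can also be issued from a script. The sketch below is only an illustration: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for the PostgreSQL file catalog, and the simplified **node** table, its column types, the sample rows and the helper names `open_demo_catalog` and `nodes_of_type` are all assumptions made for the example, not the real schema.

```python
import sqlite3


def open_demo_catalog():
    """Create an in-memory database with a simplified, assumed `node` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node ("
        " node_id INTEGER PRIMARY KEY,"
        " name TEXT,"
        " type TEXT,"
        " content_md5 TEXT)"
    )
    # Made-up sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO node (node_id, name, type, content_md5) VALUES (?, ?, ?, ?)",
        [
            (1, "alice", "container", None),
            (2, "images", "container", None),
            (3, "m31.fits", "data", "5d41402abc4b2a76b9719d911017c592"),
        ],
    )
    conn.commit()
    return conn


def nodes_of_type(conn, node_type):
    """Return (node_id, name) pairs for nodes of a given type."""
    cursor = conn.execute(
        "SELECT node_id, name FROM node WHERE type = ? ORDER BY node_id",
        (node_type,),  # bound parameter, never string concatenation
    )
    return cursor.fetchall()


if __name__ == "__main__":
    catalog = open_demo_catalog()
    print(nodes_of_type(catalog, "container"))
    print(nodes_of_type(catalog, "data"))
```

Passing values as bound parameters instead of building the SQL string by hand avoids quoting problems and SQL injection. The same pattern applies when using a PostgreSQL driver against the real catalog, although the placeholder syntax may differ.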