Seoul AI is hosting our 4th AI hackathon on Saturday, December 15th. The hackathon is built on Seoul AI Gym, a new toolkit from Seoul AI for developing AI algorithms. The gym simulates various environments and supports any training technique for an agent. Every participant's goal is to develop a cryptocurrency-trading agent; the trained agents compete to become the most successful trader. Participants are free to choose any machine learning algorithm, and Seoul AI will provide the real-time dataset.


All agents will receive a virtual 100,000,000 KRW at 10:00. The agent with the highest return by 18:50 wins the competition.

We encourage you to join when the competition starts, but you can enter at any time on December 15th.


The winner will receive Apple AirPods.


All participants must register their hackathon_id (agent_id) through this Form.

If you have any questions, feel free to ask us at seoul.ai.global@gmail.com

Judging Criteria

Important information


Saturday, 15th of December, 2018


Hyperconnect 14th floor, Daegak Bldg., 5, Seocho-daero 78-gil, Seocho-gu, Seoul, Republic of Korea



Technical information

You can check our latest documentation on GitHub. If you encounter any issues, please create an issue on GitHub Issues and we will try to take care of it as soon as possible.

You will need Python 3.6 to run the Seoul AI Market environment.


Install from source:

virtualenv -p python3.6 your_virtual_env
source your_virtual_env/bin/activate

git clone -b market https://github.com/seoulai/gym.git

cd gym

pip3 install -e .

Seoul AI Market Framework

Seoul AI Market is based on real-time reinforcement learning.

import seoulai_gym as gym
from itertools import count
from seoulai_gym.envs.market.agents import Agent
from seoulai_gym.envs.market.base import Constants

# Your agent must inherit from Seoul AI's Agent class and implement
# __init__, preprocess, algo and postprocess (see "Developing Agent class" below).
class YourAgentClassName(Agent):
    ...

if __name__ == "__main__":
    your_id = "seoul_ai"
    mode = Constants.TEST

    # Your agent's actions (these are defined in your agent's __init__;
    # see "Developing Agent class" below).
    your_actions = dict(
        holding=0,
        buy_1=+1,
        sell_2=-2,
    )

    # Create your agent.
    a1 = YourAgentClassName(your_id)
    # Create your market environment.
    env = gym.make("Market")
    # Select your id and mode to participate
    env.participate(your_id, mode)
    # reset fetches the initial state of the crypto market.
    obs = env.reset()
    # Perform real-time reinforcement learning
    for t in count():
        # Call act for the agent to take an action
        action = a1.act(obs)
        # Send your action to the market.
        next_obs, rewards, done, _ = env.step(**action)
        # Override rewards or run user-defined logic in postprocess.
        a1.postprocess(obs, action, next_obs, rewards)
        obs = next_obs

Setup details


env.reset() in TEST mode

your_id = "seoul_ai"
mode = Constants.TEST

env = gym.make("Market")
env.participate(your_id, mode)

# If you call reset in TEST mode, your cash and balance are reset to 100,000,000 KRW and 0.0 respectively.
obs = env.reset()

env.reset() in HACKATHON mode

your_id = "seoul_ai"
mode = Constants.HACKATHON

env = gym.make("Market")
env.participate(your_id, mode)

# Calling reset in HACKATHON mode fetches your current cash and balance.
# Unlike TEST mode, they are not reset.
obs = env.reset()


action = a1.act(obs)

The act function runs preprocess and algo in turn: preprocess converts the raw observation into your state, and algo maps that state to an action.


The step function sends your action and fetches the next crypto-market state so that the agent can keep trading. It returns four values: next_obs, rewards, done, and an info placeholder (unused here).


obs is short for observation. The datasets in obs are as follows:

order_book = obs.get("order_book")    # {timestamp, ask_price, ask_size, bid_price, bid_size}
trade = obs.get("trade")    # {timestamp, price, volume, ask_bid, sid} (the last 200 time-series entries)
agent_info = obs.get("agent_info")    # {cash, balance}
portfolio_rets = obs.get("portfolio_rets")    # portfolio indicators based on algorithm performance
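As a quick illustration (the values below are hypothetical, and the key layout follows the descriptions above), pulling the latest price and your cash out of obs might look like this:

```python
# Hypothetical observation, shaped like the datasets described above.
obs = {
    "trade": {
        "timestamp": [1544850000, 1544849990],
        "price": [4_300_000, 4_295_000],   # most recent entry first
        "volume": [0.5, 1.2],
    },
    "agent_info": {"cash": 100_000_000, "balance": 0.0},
}

trade = obs.get("trade")
cur_price = trade["price"][0]              # latest trade price
cash = obs.get("agent_info")["cash"]

max_btc = cash / cur_price                 # rough upper bound on a buy order
```

The `[0]` index assumes the most recent entry comes first, which matches how the preprocess example later uses `price_list[0]` as the current price.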


There are 8 types of rewards:

rewards = dict(
    return_amt=return_amt,    # Return(amount) from current action
    return_per=return_per,    # Return(%) from current action
    return_sign=return_sign,    # 1 if profited from current action. -1 if loss. else 0
    fee=fee,    # Fee from current action
    hit=hit,    # 1 if buy and price goes up or sell and price goes down. else 0.
    real_hit=real_hit,    # Fee based hit
    score_amt=score_amt,    # Return(amount) from initial cash (100,000,000 KRW)
    score=score)    # Return(%) from initial cash (100,000,000 KRW) = hackathon score
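For intuition, the two score rewards can be reproduced by hand. Assuming (hypothetically) your portfolio is worth 105,000,000 KRW at some point during the day, and that score is expressed as a percentage:

```python
initial_cash = 100_000_000        # KRW, the fixed starting cash for every agent
portfolio_value = 105_000_000     # hypothetical current portfolio value

score_amt = portfolio_value - initial_cash      # return in KRW from initial cash
score = score_amt / initial_cash * 100          # return in %, i.e. the hackathon score

print(score_amt, score)   # 5000000 5.0
```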



The value of done is always False.

Developing Agent class

Creating your agent

import seoulai_gym as gym
from seoulai_gym.envs.market.agents import Agent

# Your agent must inherit from Seoul AI's Agent class.
class YourAgentClassName(Agent):

    # You need to implement 4 methods:
    # __init__, preprocess, algo and postprocess.
    ...

__init__ (set actions)

All participants must define the actions dictionary.

    def __init__(
        self,
        agent_id: str,
    ):
        """Actions dictionary:
        key = action name, value = order parameters.
        You can use any action name.
        A buy order is filled at the first (best) sell price,
        and a sell order at the first (best) buy price.
        Orders are always filled (the probability of execution is 100%).
        """
        your_actions = dict(
            # You must define a holding action!
            holding=0,

            # + means buying, - means selling.
            buy_1=+1,     # buy_1 means that you will buy 1 bitcoin.
            sell_2=-2,    # sell_2 means that you will sell 2 bitcoins.

            # Quantities are precise to the 4th decimal place.
            buy_1_2345=+1.2345,
            sell_2_001=-2.001,

            # You can also define actions in %, as an integer between -100 and 100.
            buy_all=(+100, '%'),      # buy with 100% of the purchase amount
            sell_20per=(-20, '%'),    # sell 20% of the available volume
        )
        super().__init__(agent_id, your_actions)


preprocess (state definition)

You can select data from the raw observation (obs) and transform it however you like. We encourage you to normalize your data.

    def preprocess(self, obs):
        # requires `import numpy as np` at the top of your file
        # get data
        trades = obs.get("trade")

        # make your own data!
        price_list = trades.get("price")
        cur_price = price_list[0]
        price10 = price_list[:10]

        ma10 = np.mean(price10)
        std10 = np.std(price10)
        threshold = 1.0

        # obs -> state
        your_state = dict(
            buy_signal=(cur_price > ma10 + std10 * threshold),
            sell_signal=(cur_price < ma10 - std10 * threshold),
        )
        return your_state
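To see the signal logic fire on concrete numbers, here is the same moving-average computation on a hypothetical price series (most recent price first, as in obs):

```python
import numpy as np

# Hypothetical last 10 trade prices, most recent first.
price_list = [4_400_000, 4_300_000, 4_290_000, 4_310_000, 4_295_000,
              4_305_000, 4_300_000, 4_298_000, 4_302_000, 4_300_000]

cur_price = price_list[0]
price10 = price_list[:10]

ma10 = np.mean(price10)        # 10-period moving average
std10 = np.std(price10)        # 10-period standard deviation
threshold = 1.0

buy_signal = cur_price > ma10 + std10 * threshold    # True here: price broke above the band
sell_signal = cur_price < ma10 - std10 * threshold   # False here
```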

algo (algorithm definition)

The algo function defines your trading conditions and must return self.action(...).

    def algo(self, state):
        if state["buy_signal"]:
            # Pass an action name from your actions dictionary.
            return self.action("buy_all")
        elif state["sell_signal"]:
            return self.action("sell_20per")
        else:
            # You can also pass the index of the action name.
            return self.action(0)


postprocess (reward definition)

You can select and redefine your reward through the postprocess function.

    def postprocess(self, obs, action, next_obs, rewards):
        your_reward = 0

        # Select one of the predefined rewards...
        your_reward = rewards.get("hit")

        # ...or redefine your own.
        trades = obs.get("trade")
        next_trades = next_obs.get("trade")

        cur_price = trades["price"][0]
        next_price = next_trades["price"][0]

        change_price = next_price - cur_price

        your_reward = np.sign(change_price)
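With hypothetical consecutive prices, the sign-based reward above works out as follows:

```python
import numpy as np

cur_price = 4_300_000    # hypothetical price when the action was taken
next_price = 4_310_000   # hypothetical price one step later

change_price = next_price - cur_price
your_reward = np.sign(change_price)   # +1 if the price rose, -1 if it fell, 0 otherwise
```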

DQN Example


Rule based Example


Random Example