docker - Overriding `tcp.publish_port` breaks clustering when Elasticsearch is in a container


I'm trying to run an Elasticsearch cluster with each ES node running in its own container. These containers are deployed using ECS across several machines that may be running other, unrelated containers. To avoid port conflicts, each port a container exposes is assigned a random value on the host. These random ports are consistent across all running containers of the same type. In other words, all running ES-node containers map port 9300 to the same random number.

Here's the config I'm using:

    network:
      host: 0.0.0.0
    plugin:
      mandatory: cloud-aws
    cluster:
      name: ${es_cluster_name}
    discovery:
      type: ec2
      ec2:
        groups: ${es_security_group}
        any_group: false
      zen.ping.multicast.enabled: false
    transport:
      tcp.port: 9300
      publish_port: ${_instance_port_transport}
    cloud.aws:
      access_key: ${aws_access_key}
      secret_key: ${aws_secret_key}
      region: ${aws_region}

In this case, `_instance_port_transport` is the port on the host machine that container port 9300 is bound to. I've confirmed that all of the environment variables used above are set correctly. I'm also setting `network.publish_host` to the host machine's local IP via a command-line arg.

When I forced `_instance_port_transport` (and in turn `transport.publish_port`) to be 9300, everything worked great, but as soon as it's given a random value, nodes can no longer connect to each other. I see errors like this using `logger.discovery=trace`:

    connecttransportexception[[][10.0.xxx.xxx:9300] connect_timeout[30s]]; nested: connectexception[connection refused: /10.0.xxx.xxx:9300];
        at org.elasticsearch.transport.netty.nettytransport.connecttochannelslight(nettytransport.java:952)
        at org.elasticsearch.transport.netty.nettytransport.connecttonode(nettytransport.java:916)
        at org.elasticsearch.transport.netty.nettytransport.connecttonodelight(nettytransport.java:888)
        at org.elasticsearch.transport.transportservice.connecttonodelight(transportservice.java:267)
        at org.elasticsearch.discovery.zen.ping.unicast.unicastzenping$3.run(unicastzenping.java:395)
        at java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1142)
        at java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:617)
        at java.lang.thread.run(thread.java:745)
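A "connection refused" on 9300 can be checked independently of Elasticsearch with a plain TCP probe. The following is only a minimal sketch: the helper name `can_connect` is made up for illustration, and any host or port you pass to it is your own value, not something taken from the setup above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it only checks that something
    accepts connections on the port, not that it is an Elasticsearch node.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling it once with the host's IP and 9300, and once with the same IP and the randomly assigned host port, would show which of the two ports is actually reachable from another machine.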

It seems that the port a node binds to is the same port it pings while trying to connect to other nodes. Is there a way to make them different? If not, what's the point of `transport.publish_port`?

The way the discovery-ec2 plugin works is that it collects a list of IP addresses using the AWS EC2 API and uses this list as the unicast list of nodes.

But it does not collect any information from the running cluster. Obviously, the node is not yet connected! So it does not know the `publish_port` of the other nodes.

It just adds an IP address, and that's all. Elasticsearch then uses the default port, which is 9300.
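To make the mismatch concrete, here is a small model of that behavior. It is a sketch of the idea only, not the plugin's actual code: the function names, the IP addresses, and the host port 32768 are all made up for illustration.

```python
DEFAULT_TRANSPORT_PORT = 9300


def build_unicast_hosts(ip_addresses, port=DEFAULT_TRANSPORT_PORT):
    """Pair every discovered IP with one fixed port, as described above."""
    return [f"{ip}:{port}" for ip in ip_addresses]


def unreachable_hosts(unicast_hosts, listening):
    """Return the unicast entries on which no node is actually listening."""
    return [host for host in unicast_hosts if host not in listening]


# Illustrative values: two hosts whose containers map 9300 to host port 32768.
discovered_ips = ["10.0.0.11", "10.0.0.12"]
listening = {f"{ip}:32768" for ip in discovered_ips}

unicast = build_unicast_hosts(discovered_ips)
print(unicast)                                # every entry uses port 9300
print(unreachable_hosts(unicast, listening))  # every entry is unreachable
```

Because the unicast list is built before any node has been contacted, the port each node publishes never enters the picture, which matches the connection-refused errors on 9300 in the question.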

So there is nothing you can do, IMO, to fix this in the short term.

But I can imagine adding a new feature close to what has been implemented for Google Compute Engine. There, we use a specific metadata entry to get the port through the GCE APIs.
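A metadata-based lookup of that kind could look roughly like the sketch below. Everything in it is hypothetical: the `Instance` type, the `es_port` tag name, and the fallback logic are assumptions for illustration, not an existing feature of the EC2 plugin.

```python
from dataclasses import dataclass, field

DEFAULT_TRANSPORT_PORT = 9300
PORT_TAG = "es_port"  # hypothetical tag name, chosen only for this sketch


@dataclass
class Instance:
    """Minimal stand-in for an instance record returned by a cloud API."""
    private_ip: str
    tags: dict = field(default_factory=dict)


def transport_address(instance: Instance) -> str:
    """Use the port from the instance's tag if present, else the default."""
    raw_port = instance.tags.get(PORT_TAG)
    port = int(raw_port) if raw_port is not None else DEFAULT_TRANSPORT_PORT
    return f"{instance.private_ip}:{port}"


tagged = Instance("10.0.0.11", {PORT_TAG: "32768"})
untagged = Instance("10.0.0.12")
print(transport_address(tagged))    # uses the tagged port
print(transport_address(untagged))  # falls back to 9300
```

The design point is that the port travels with the instance's metadata, so discovery can learn it before any node-to-node connection exists.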

We could do the same for Azure and EC2. Do you want to open an issue so we can track the effort?